unregister_key_type() has code to mark a key as dead and make it unavailable in
one loop and then destroy all those unavailable key payloads in the next loop.
However, the loop to mark keys dead renders the key undetectable to the second
loop by changing the key type pointer also.
Fix this by the following means:
(1) The key code has two garbage collectors: one deletes unreferenced keys and
the other alters keyrings to delete links to old dead, revoked and expired
keys. They can end up holding each other up as both want to scan the key
serial tree under spinlock. Combine these into a single routine.
(2) Move the dead key marking, dead link removal and dead key removal into the
garbage collector as a three phase process running over the three cycles
of the normal garbage collection procedure. This is tracked by the
KEY_GC_REAPING_DEAD_1, _2 and _3 state flags.
unregister_key_type() then just unlinks the key type from the list, wakes
up the garbage collector and waits for the third phase to complete.
(3) Downgrade the key types sem in unregister_key_type() once it has deleted
the key type from the list so that it doesn't block the keyctl() syscall.
(4) Dead keys that cannot be simply removed in the third phase have their
payloads destroyed with the key's semaphore write-locked to prevent
interference by the keyctl() syscall. There should be no in-kernel users
of dead keys of that type by the point of unregistration, though keyctl()
may be holding a reference.
(5) Only perform timer recalculation in the GC if the timer actually expired.
If it didn't, we'll get another cycle when it goes off - and if the key
that actually triggered it has been removed, it's not a problem.
(6) Only garbage collect link if the timer expired or if we're doing dead key
clean up phase 2.
(7) As only key_garbage_collector() is permitted to use rb_erase() on the key
serial tree, it doesn't need to revalidate its cursor after dropping the
spinlock as the node the cursor points to must still exist in the tree.
(8) Drop the spinlock in the GC if there is contention on it or if we need to
reschedule. After dealing with that, get the spinlock again and resume
scanning.
This has been tested in the following ways:
(1) Run the keyutils testsuite against it.
(2) Using the AF_RXRPC and RxKAD modules to test keytype removal:
Load the rxrpc_s key type:
# insmod /tmp/af-rxrpc.ko
# insmod /tmp/rxkad.ko
Create a key (http://people.redhat.com/~dhowells/rxrpc/listen.c):
# /tmp/listen &
[1] 8173
Find the key:
# grep rxrpc_s /proc/keys
091086e1 I--Q-- 1 perm 39390000 0 0 rxrpc_s 52:2
Link it to a session keyring, preferably one with a higher serial number:
# keyctl link 0x20e36251 @s
Kill the process (the key should remain as it's linked to another place):
# fg
/tmp/listen
^C
Remove the key type:
rmmod rxkad
rmmod af-rxrpc
This can be made a more effective test by altering the following part of
the patch:
if (unlikely(gc_state & KEY_GC_REAPING_DEAD_2)) {
/* Make sure everyone revalidates their keys if we marked a
* bunch as being dead and make sure all keyring ex-payloads
* are destroyed.
*/
kdebug("dead sync");
synchronize_rcu();
To call synchronize_rcu() in GC phase 1 instead. That causes that the
keyring's old payload content to hang around longer until it's RCU
destroyed - which usually happens after GC phase 3 is complete. This
allows the destroy_dead_key branch to be tested.
Reported-by: Benjamin Coddington <bcodding@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
The dead key link reaper should be non-reentrant as it relies on global state
to keep track of where it's got to when it returns to the work queue manager to
give it some air.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Make the key reaper non-reentrant by sticking it on the appropriate system work
queue when we queue it. This will allow it to have global state and drop
locks. It should probably be non-reentrant already as it may spend a long time
holding the key serial spinlock, and so multiple entrants can spend long
periods of time just sitting there spinning, waiting to get the lock.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Move the unreferenced key reaper function to the keys garbage collector file
as that's a more appropriate place with the dead key link reaper.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
__key_link() should use the RCU deref wrapper rcu_dereference_locked_keyring()
for accessing keyring payloads rather than calling rcu_dereference_protected()
directly.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
The keyctl call:
keyctl_get_keyring_ID(KEY_SPEC_SESSION_KEYRING, 1)
should create a session keyring if the process doesn't have one of its own
because the create flag argument is set - rather than subscribing to and
returning the user-session keyring as:
keyctl_get_keyring_ID(KEY_SPEC_SESSION_KEYRING, 0)
will do.
This can be tested by commenting out pam_keyinit in the /etc/pam.d files and
running the following program a couple of times in a row:
#include <stdio.h>
#include <stdlib.h>
#include <keyutils.h>
int main(int argc, char *argv[])
{
key_serial_t uk, usk, sk, nsk;
uk = keyctl_get_keyring_ID(KEY_SPEC_USER_KEYRING, 0);
usk = keyctl_get_keyring_ID(KEY_SPEC_USER_SESSION_KEYRING, 0);
sk = keyctl_get_keyring_ID(KEY_SPEC_SESSION_KEYRING, 0);
nsk = keyctl_get_keyring_ID(KEY_SPEC_SESSION_KEYRING, 1);
printf("keys: %08x %08x %08x %08x\n", uk, usk, sk, nsk);
return 0;
}
Without this patch, I see:
keys: 3975ddc7 119c0c66 119c0c66 119c0c66
keys: 3975ddc7 119c0c66 119c0c66 119c0c66
With this patch, I see:
keys: 2cb4997b 34112878 34112878 17db2ce3
keys: 2cb4997b 34112878 34112878 39f3c73e
As can be seen, the session keyring starts off the same as the user-session
keyring each time, but with the patch a new session keyring is created when
the create flag is set.
Reported-by: Greg Wettstein <greg@enjellic.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Greg Wettstein <greg@enjellic.com>
Signed-off-by: James Morris <jmorris@namei.org>
If install_session_keyring() is given a keyring, it should install it rather
than just creating a new one anyway. This was accidentally broken in:
commit d84f4f992c
Author: David Howells <dhowells@redhat.com>
Date: Fri Nov 14 10:39:23 2008 +1100
Subject: CRED: Inaugurate COW credentials
The impact of that commit is that pam_keyinit no longer works correctly if
'force' isn't specified against a login process. This is because:
keyctl_get_keyring_ID(KEY_SPEC_SESSION_KEYRING, 0)
now always creates a new session keyring and thus the check whether the session
keyring and the user-session keyring are the same is always false. This leads
pam_keyinit to conclude that a session keyring is installed and it shouldn't be
revoked by pam_keyinit here if 'revoke' is specified.
Any system that specifies 'force' against pam_keyinit in the PAM configuration
files for login methods (login, ssh, su -l, kdm, etc.) is not affected since
that bypasses the broken check and forces the creation of a new session keyring
anyway (for which the revoke flag is not cleared) - and any subsequent call to
pam_keyinit really does have a session keyring already installed, and so the
check works correctly there.
Reverting to the previous behaviour will cause the kernel to subscribe the
process to the user-session keyring as its session keyring if it doesn't have a
session keyring of its own. pam_keyinit will detect this and install a new
session keyring anyway (and won't clear the revert flag).
This can be tested by commenting out pam_keyinit in the /etc/pam.d files and
running the following program a couple of times in a row:
#include <stdio.h>
#include <stdlib.h>
#include <keyutils.h>
int main(int argc, char *argv[])
{
key_serial_t uk, usk, sk;
uk = keyctl_get_keyring_ID(KEY_SPEC_USER_KEYRING, 0);
usk = keyctl_get_keyring_ID(KEY_SPEC_USER_SESSION_KEYRING, 0);
sk = keyctl_get_keyring_ID(KEY_SPEC_SESSION_KEYRING, 0);
printf("keys: %08x %08x %08x\n", uk, usk, sk);
return 0;
}
Without the patch, I see:
keys: 3884e281 24c4dfcf 22825f8e
keys: 3884e281 24c4dfcf 068772be
With the patch, I see:
keys: 26be9c83 0e755ce0 0e755ce0
keys: 26be9c83 0e755ce0 0e755ce0
As can be seen, with the patch, the session keyring is the same as the
user-session keyring each time; without the patch a new session keyring is
generated each time.
Reported-by: Greg Wettstein <greg@enjellic.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Greg Wettstein <greg@enjellic.com>
Signed-off-by: James Morris <jmorris@namei.org>
Although the EVM encrypted-key should be encrypted/decrypted using a
trusted-key, a user-defined key could be used instead. When using a user-
defined key, a TCG_TPM dependency should not be required. Unfortunately,
the encrypted-key code needs to be refactored a bit in order to remove
this dependency.
This patch adds the TCG_TPM dependency.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>,
Randy Dunlap <rdunlap@xenotimenet>
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
daemonize() is only needed when a user-space task does kernel_thread().
tomoyo_gc_thread() is kthread_create()'ed and thus it doesn't need
the soon-to-be-deprecated daemonize().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: James Morris <jmorris@namei.org>
Initialize has_cap in cap_bprm_set_creds()
Reported-by: Andrew G. Morgan <morgan@kernel.org>
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: James Morris <jmorris@namei.org>
A task (when !SECURE_NOROOT) which executes a setuid-root binary will
obtain root privileges while executing that binary. If the binary also
has effective capabilities set, then only those capabilities will be
granted. The rationale is that the same binary can carry both setuid-root
and the minimal file capability set, so that on a filesystem not
supporting file caps the binary can still be executed with privilege,
while on a filesystem supporting file caps it will run with minimal
privilege.
This special case currently does NOT happen if there are file capabilities
but no effective capabilities. Since capability-aware programs can very
well start with empty pE but populated pP and move those caps to pE when
needed. In other words, if the file has file capabilities but NOT
effective capabilities, then we should do the same thing as if there
were file capabilities, and not grant full root privileges.
This patchset does that.
(Changelog by Serge Hallyn).
Signed-off-by: Zhi Li <lizhi1215@gmail.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: James Morris <jmorris@namei.org>
evm_inode_init_security() should return 0, when EVM is not enabled.
(Returning an error is a remnant of evm_inode_post_init_security.)
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
Select trusted and encrypted keys if EVM is selected, to ensure
the requisite symbols are available. Otherwise, these can be
selected as modules while EVM is static, leading to a kernel
build failure.
Signed-off-by: James Morris <jmorris@namei.org>
Commit bd03a3e4 "TOMOYO: Add policy namespace support." forgot to set EOF flag
and forgot to print namespace at PREFERENCE line.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
My @hp.com will no longer be valid starting August 5, 2011 so an update is
necessary. My new email address is employer independent so we don't have
to worry about doing this again any time soon.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (54 commits)
tpm_nsc: Fix bug when loading multiple TPM drivers
tpm: Move tpm_tis_reenable_interrupts out of CONFIG_PNP block
tpm: Fix compilation warning when CONFIG_PNP is not defined
TOMOYO: Update kernel-doc.
tpm: Fix a typo
tpm_tis: Probing function for Intel iTPM bug
tpm_tis: Fix the probing for interrupts
tpm_tis: Delay ACPI S3 suspend while the TPM is busy
tpm_tis: Re-enable interrupts upon (S3) resume
tpm: Fix display of data in pubek sysfs entry
tpm_tis: Add timeouts sysfs entry
tpm: Adjust interface timeouts if they are too small
tpm: Use interface timeouts returned from the TPM
tpm_tis: Introduce durations sysfs entry
tpm: Adjust the durations if they are too small
tpm: Use durations returned from TPM
TOMOYO: Enable conditional ACL.
TOMOYO: Allow using argv[]/envp[] of execve() as conditions.
TOMOYO: Allow using executable's realpath and symlink's target as conditions.
TOMOYO: Allow using owner/group etc. of file objects as conditions.
...
Fix up trivial conflict in security/tomoyo/realpath.c
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
merge fchmod() and fchmodat() guts, kill ancient broken kludge
xfs: fix misspelled S_IS...()
xfs: get rid of open-coded S_ISREG(), etc.
vfs: document locking requirements for d_move, __d_move and d_materialise_unique
omfs: fix (mode & S_IFDIR) abuse
btrfs: S_ISREG(mode) is not mode & S_IFREG...
ima: fmode_t misspelled as mode_t...
pci-label.c: size_t misspelled as mode_t
jffs2: S_ISLNK(mode & S_IFMT) is pointless
snd_msnd ->mode is fmode_t, not mode_t
v9fs_iop_get_acl: get rid of unused variable
vfs: dont chain pipe/anon/socket on superblock s_inodes list
Documentation: Exporting: update description of d_splice_alias
fs: add missing unlock in default_llseek()
This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>
Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits)
fs: Merge split strings
treewide: fix potentially dangerous trailing ';' in #defined values/expressions
uwb: Fix misspelling of neighbourhood in comment
net, netfilter: Remove redundant goto in ebt_ulog_packet
trivial: don't touch files that are removed in the staging tree
lib/vsprintf: replace link to Draft by final RFC number
doc: Kconfig: `to be' -> `be'
doc: Kconfig: Typo: square -> squared
doc: Konfig: Documentation/power/{pm => apm-acpi}.txt
drivers/net: static should be at beginning of declaration
drivers/media: static should be at beginning of declaration
drivers/i2c: static should be at beginning of declaration
XTENSA: static should be at beginning of declaration
SH: static should be at beginning of declaration
MIPS: static should be at beginning of declaration
ARM: static should be at beginning of declaration
rcu: treewide: Do not use rcu_read_lock_held when calling rcu_dereference_check
Update my e-mail address
PCIe ASPM: forcedly -> forcibly
gma500: push through device driver tree
...
Fix up trivial conflicts:
- arch/arm/mach-ep93xx/dma-m2p.c (deleted)
- drivers/gpio/gpio-ep93xx.c (renamed and context nearby)
- drivers/net/r8169.c (just context changes)
For a number of file systems that don't have a mount point (e.g. sockfs
and pipefs), they are not marked as long term. Therefore in
mntput_no_expire, all locks in vfs_mount lock are taken instead of just
local cpu's lock to aggregate reference counts when we release
reference to file objects. In fact, only local lock need to have been
taken to update ref counts as these file systems are in no danger of
going away until we are ready to unregister them.
The attached patch marks file systems using kern_mount without
mount point as long term. The contentions of vfs_mount lock
is now eliminated. Before un-registering such file system,
kern_unmount should be called to remove the long term flag and
make the mount point ready to be freed.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (107 commits)
vfs: use ERR_CAST for err-ptr tossing in lookup_instantiate_filp
isofs: Remove global fs lock
jffs2: fix IN_DELETE_SELF on overwriting rename() killing a directory
fix IN_DELETE_SELF on overwriting rename() on ramfs et.al.
mm/truncate.c: fix build for CONFIG_BLOCK not enabled
fs:update the NOTE of the file_operations structure
Remove dead code in dget_parent()
AFS: Fix silly characters in a comment
switch d_add_ci() to d_splice_alias() in "found negative" case as well
simplify gfs2_lookup()
jfs_lookup(): don't bother with . or ..
get rid of useless dget_parent() in btrfs rename() and link()
get rid of useless dget_parent() in fs/btrfs/ioctl.c
fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers
drivers: fix up various ->llseek() implementations
fs: handle SEEK_HOLE/SEEK_DATA properly in all fs's that define their own llseek
Ext4: handle SEEK_HOLE/SEEK_DATA generically
Btrfs: implement our own ->llseek
fs: add SEEK_HOLE and SEEK_DATA flags
reiserfs: make reiserfs default to barrier=flush
...
Fix up trivial conflicts in fs/xfs/linux-2.6/xfs_super.c due to the new
shrinker callout for the inode cache, that clashed with the xfs code to
start the periodic workers later.
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
rcu: Fix wrong check in list_splice_init_rcu()
net,rcu: Convert call_rcu(xt_rateest_free_rcu) to kfree_rcu()
sysctl,rcu: Convert call_rcu(free_head) to kfree
vmalloc,rcu: Convert call_rcu(rcu_free_vb) to kfree_rcu()
vmalloc,rcu: Convert call_rcu(rcu_free_va) to kfree_rcu()
ipc,rcu: Convert call_rcu(ipc_immediate_free) to kfree_rcu()
ipc,rcu: Convert call_rcu(free_un) to kfree_rcu()
security,rcu: Convert call_rcu(sel_netport_free) to kfree_rcu()
security,rcu: Convert call_rcu(sel_netnode_free) to kfree_rcu()
ia64,rcu: Convert call_rcu(sn_irq_info_free) to kfree_rcu()
block,rcu: Convert call_rcu(disk_free_ptbl_rcu_cb) to kfree_rcu()
scsi,rcu: Convert call_rcu(fc_rport_free_rcu) to kfree_rcu()
audit_tree,rcu: Convert call_rcu(__put_tree) to kfree_rcu()
security,rcu: Convert call_rcu(whitelist_item_free) to kfree_rcu()
md,rcu: Convert call_rcu(free_conf) to kfree_rcu()
* 'ptrace' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc: (39 commits)
ptrace: do_wait(traced_leader_killed_by_mt_exec) can block forever
ptrace: fix ptrace_signal() && STOP_DEQUEUED interaction
connector: add an event for monitoring process tracers
ptrace: dont send SIGSTOP on auto-attach if PT_SEIZED
ptrace: mv send-SIGSTOP from do_fork() to ptrace_init_task()
ptrace_init_task: initialize child->jobctl explicitly
has_stopped_jobs: s/task_is_stopped/SIGNAL_STOP_STOPPED/
ptrace: make former thread ID available via PTRACE_GETEVENTMSG after PTRACE_EVENT_EXEC stop
ptrace: wait_consider_task: s/same_thread_group/ptrace_reparented/
ptrace: kill real_parent_is_ptracer() in in favor of ptrace_reparented()
ptrace: ptrace_reparented() should check same_thread_group()
redefine thread_group_leader() as exit_signal >= 0
do not change dead_task->exit_signal
kill task_detached()
reparent_leader: check EXIT_DEAD instead of task_detached()
make do_notify_parent() __must_check, update the callers
__ptrace_detach: avoid task_detached(), check do_notify_parent()
kill tracehook_notify_death()
make do_notify_parent() return bool
ptrace: s/tracehook_tracer_task()/ptrace_parent()/
...
The rcu callback sel_netport_free() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(sel_netport_free).
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Eric Paris <eparis@parisplace.org>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
The rcu callback sel_netnode_free() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(sel_netnode_free).
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Eric Paris <eparis@parisplace.org>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
The rcu callback whitelist_item_free() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(whitelist_item_free).
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: James Morris <jmorris@namei.org>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
pass mask instead; kill security_inode_exec_permission() since we can use
security_inode_permission() instead.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Permit changing of security.evm only when valid, unless in fixmode.
Reported-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
We will use digital signatures in addtion to hmac.
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@nokia.com>
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
If EVM is not supported or enabled, evm_verify_hmac() returns
INTEGRITY_UNKNOWN, which ima_appraise_measurement() ignores and sets
the appraisal status based solely on the security.ima verification.
evm_verify_hmac() also returns INTEGRITY_UNKNOWN for other failures, such
as temporary failures like -ENOMEM, resulting in possible attack vectors.
This patch changes the default return code for temporary/unexpected
failures, like -ENOMEM, from INTEGRITY_UNKNOWN to INTEGRITY_FAIL, making
evm_verify_hmac() fail safe.
As a result, failures need to be re-evaluated in order to catch both
temporary errors, such as the -ENOMEM, as well as errors that have been
resolved in fix mode.
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@nokia.com>
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Using shash is more efficient, because the algorithm is allocated only
once. Only the descriptor to store the hash state needs to be allocated
for every operation.
Changelog v6:
- check for crypto_shash_setkey failure
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@nokia.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Changelog v7:
- moved the initialization call to security_inode_init_security,
renaming evm_inode_post_init_security to evm_inode_init_security
- increase size of xattr array for EVM xattr
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Initialize 'security.evm' for new files.
Changelog v7:
- renamed evm_inode_post_init_security to evm_inode_init_security
- moved struct xattr definition to earlier patch
- allocate xattr name
Changelog v6:
- Use 'struct evm_ima_xattr_data'
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Imbed the evm calls evm_inode_setxattr(), evm_inode_post_setxattr(),
evm_inode_removexattr() in the security hooks. evm_inode_setxattr()
protects security.evm xattr. evm_inode_post_setxattr() and
evm_inode_removexattr() updates the hmac associated with an inode.
(Assumes an LSM module protects the setting/removing of xattr.)
Changelog:
- Don't define evm_verifyxattr(), unless CONFIG_INTEGRITY is enabled.
- xattr_name is a 'const', value is 'void *'
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
EVM protects a file's security extended attributes(xattrs) against integrity
attacks. The current patchset maintains an HMAC-sha1 value across the security
xattrs, storing the value as the extended attribute 'security.evm'. We
anticipate other methods for protecting the security extended attributes.
This patch reserves the first byte of 'security.evm' as a place holder for
the type of method.
Changelog v6:
- move evm_ima_xattr_type definition to security/integrity/integrity.h
- defined a structure for the EVM xattr called evm_ima_xattr_data
(based on Serge Hallyn's suggestion)
- removed unnecessary memset
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@nokia.com>
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
EVM protects a file's security extended attributes(xattrs) against integrity
attacks. This patchset provides the framework and an initial method. The
initial method maintains an HMAC-sha1 value across the security extended
attributes, storing the HMAC value as the extended attribute 'security.evm'.
Other methods of validating the integrity of a file's metadata will be posted
separately (eg. EVM-digital-signatures).
While this patchset does authenticate the security xattrs, and
cryptographically binds them to the inode, coming extensions will bind other
directory and inode metadata for more complete protection. To help simplify
the review and upstreaming process, each extension will be posted separately
(eg. IMA-appraisal, IMA-appraisal-directory). For a general overview of the
proposed Linux integrity subsystem, refer to Dave Safford's whitepaper:
http://downloads.sf.net/project/linux-ima/linux-ima/Integrity_overview.pdf.
EVM depends on the Kernel Key Retention System to provide it with a
trusted/encrypted key for the HMAC-sha1 operation. The key is loaded onto the
root's keyring using keyctl. Until EVM receives notification that the key has
been successfully loaded onto the keyring (echo 1 > <securityfs>/evm), EVM can
not create or validate the 'security.evm' xattr, but returns INTEGRITY_UNKNOWN.
Loading the key and signaling EVM should be done as early as possible. Normally
this is done in the initramfs, which has already been measured as part of the
trusted boot. For more information on creating and loading existing
trusted/encrypted keys, refer to Documentation/keys-trusted-encrypted.txt. A
sample dracut patch, which loads the trusted/encrypted key and enables EVM, is
available from http://linux-ima.sourceforge.net/#EVM.
Based on the LSMs enabled, the set of EVM protected security xattrs is defined
at compile. EVM adds the following three calls to the existing security hooks:
evm_inode_setxattr(), evm_inode_post_setxattr(), and evm_inode_removexattr. To
initialize and update the 'security.evm' extended attribute, EVM defines three
calls: evm_inode_post_init(), evm_inode_post_setattr() and
evm_inode_post_removexattr() hooks. To verify the integrity of a security
xattr, EVM exports evm_verifyxattr().
Changelog v7:
- Fixed URL in EVM ABI documentation
Changelog v6: (based on Serge Hallyn's review)
- fix URL in patch description
- remove evm_hmac_size definition
- use SHA1_DIGEST_SIZE (removed both MAX_DIGEST_SIZE and evm_hmac_size)
- moved linux include before other includes
- test for crypto_hash_setkey failure
- fail earlier for invalid key
- clear entire encrypted key, even on failure
- check xattr name length before comparing xattr names
Changelog:
- locking based on i_mutex, remove evm_mutex
- using trusted/encrypted keys for storing the EVM key used in the HMAC-sha1
operation.
- replaced crypto hash with shash (Dmitry Kasatkin)
- support for additional methods of verifying the security xattrs
(Dmitry Kasatkin)
- iint not allocated for all regular files, but only for those appraised
- Use cap_sys_admin in lieu of cap_mac_admin
- Use __vfs_setxattr_noperm(), without permission checks, from EVM
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Move the inode integrity data(iint) management up to the integrity directory
in order to share the iint among the different integrity models.
Changelog:
- don't define MAX_DIGEST_SIZE
- rename several globally visible 'ima_' prefixed functions, structs,
locks, etc to 'integrity_'
- replace '20' with SHA1_DIGEST_SIZE
- reflect location change in appropriate Kconfig and Makefiles
- remove unnecessary initialization of iint_initialized to 0
- rebased on current ima_iint.c
- define integrity_iint_store/lock as static
There should be no other functional changes.
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
This patch changes the security_inode_init_security API by adding a
filesystem specific callback to write security extended attributes.
This change is in preparation for supporting the initialization of
multiple LSM xattrs and the EVM xattr. Initially the callback function
walks an array of xattrs, writing each xattr separately, but could be
optimized to write multiple xattrs at once.
For existing security_inode_init_security() calls, which have not yet
been converted to use the new callback function, such as those in
reiserfs and ocfs2, this patch defines security_old_inode_init_security().
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Update comments for scripts/kernel-doc and fix some of errors reported by
scripts/checkpatch.pl .
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
This patch adds support for permission checks using argv[]/envp[] of execve()
request. Hooks are in the last patch of this pathset.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
This patch adds support for permission checks using executable file's realpath
upon execve() and symlink's target upon symlink(). Hooks are in the last patch
of this pathset.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
This patch adds support for permission checks using file object's DAC
attributes (e.g. owner/group) when checking file's pathnames. Hooks for passing
file object's pointers are in the last patch of this pathset.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
This patch adds support for permission checks using current thread's UID/GID
etc. in addition to pathnames.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
Since ca5ecddf (rcu: define __rcu address space modifier for sparse)
rcu_dereference_check use rcu_read_lock_held as a part of condition
automatically so callers do not have to do that as well.
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
/sys/kernel/security/tomoyo/.domain_status can be easily emulated using
/sys/kernel/security/tomoyo/domain_policy . We can remove this interface by
updating /usr/sbin/tomoyo-setprofile utility.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
I forgot to add #ifndef in commit 0e4ae0e0 "TOMOYO: Make several options
configurable.", resulting
security/built-in.o: In function `tomoyo_bprm_set_creds':
tomoyo.c:(.text+0x4698e): undefined reference to `tomoyo_load_policy'
error.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
AppArmor is masking the capabilities returned by capget against the
capabilities mask in the profile. This is wrong, in complain mode the
profile has effectively all capabilities, as the profile restrictions are
not being enforced, merely tested against to determine if an access is
known by the profile.
This can result in the wrong behavior of security conscience applications
like sshd which examine their capability set, and change their behavior
accordingly. In this case because of the masked capability set being
returned sshd fails due to DAC checks, even when the profile is in complain
mode.
Kernels affected: 2.6.36 - 3.0.
Signed-off-by: John Johansen <john.johansen@canonical.com>
The pointer returned from tracehook_tracer_task() is only valid inside
the rcu_read_lock. However the tracer pointer obtained is being passed
to aa_may_ptrace outside of the rcu_read_lock critical section.
Mover the aa_may_ptrace test into the rcu_read_lock critical section, to
fix this.
Kernels affected: 2.6.36 - 3.0
Reported-by: Oleg Nesterov <oleg@redhat.com>
Cc: stable@kernel.org
Signed-off-by: John Johansen <john.johansen@canonical.com>
To be able to start using enforcing mode from the early stage of boot sequence,
this patch adds support for activating access control without calling external
policy loader program. This will be useful for systems where operations which
can lead to the hijacking of the boot sequence are needed before loading the
policy. For example, you can activate immediately after loading the fixed part
of policy which will allow only operations needed for mounting a partition
which contains the variant part of policy and verifying (e.g. running GPG
check) and loading the variant part of policy. Since you can start using
enforcing mode from the beginning, you can reduce the possibility of hijacking
the boot sequence.
This patch makes several variables configurable on build time. This patch also
adds TOMOYO_loader= and TOMOYO_trigger= kernel command line option to boot the
same kernel in two different init systems (BSD-style init and systemd).
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
To be able to start using enforcing mode from the early stage of boot sequence,
this patch adds support for built-in policy configuration (and next patch adds
support for activating access control without calling external policy loader
program).
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
Show statistics such as last policy update time and last policy violation time
in addition to memory usage.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
Gather string constants to one file in order to make the object size smaller.
Use unsigned type where appropriate.
read()/write() returns ssize_t.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
Currently TOMOYO holds SRCU lock upon open() and releases it upon close()
because list elements stored in the "struct tomoyo_io_buffer" instances are
accessed until close() is called. However, such SRCU usage causes lockdep to
complain about leaving the kernel with SRCU lock held.
This patch solves the warning by holding/releasing SRCU upon each
read()/write(). This patch is doing something similar to calling kfree()
without calling synchronize_srcu(), by selectively deferring kfree() by keeping
track of the "struct tomoyo_io_buffer" instances.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
TOMOYO wants to use /proc/self/ rather than /proc/$PID/ if $PID matches current
thread's process ID in order to prevent current thread from accessing other
process's information unless needed.
But since procfs can be mounted on various locations (e.g. /proc/ /proc2/ /p/
/tmp/foo/100/p/ ), TOMOYO cannot tell that whether the numeric part in the
string returned by __d_path() represents process ID or not.
Therefore, to be able to convert from $PID to self no matter where procfs is
mounted, this patch changes pathname representations for filesystems which do
not support rename() operation (e.g. proc, sysfs, securityfs).
Examples:
/proc/self/mounts => proc:/self/mounts
/sys/kernel/security/ => sys:/kernel/security/
/dev/pts/0 => devpts:/0
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
Mauras Olivier reported that it is difficult to use TOMOYO in LXC environments,
for TOMOYO cannot distinguish between environments outside the container and
environments inside the container since LXC environments are created using
pivot_root(). To address this problem, this patch introduces policy namespace.
Each policy namespace has its own set of domain policy, exception policy and
profiles, which are all independent of other namespaces. This independency
allows users to develop policy without worrying interference among namespaces.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
ACL group allows administrator to globally grant not only "file read"
permission but also other permissions.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
Add /sys/kernel/security/tomoyo/audit interface. This interface generates audit
logs in the form of domain policy so that /usr/sbin/tomoyo-auditd can reuse
audit logs for appending to /sys/kernel/security/tomoyo/domain_policy
interface.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
Remove global preference from profile structure in order to make code simpler.
Due to this structure change, printk() warnings upon policy violation are
temporarily disabled. They will be replaced by
/sys/kernel/security/tomoyo/audit by next patch.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
Convert "allow_..." style directives to "file ..." style directives.
By converting to the latter style, we can pack policy like
"file read/write/execute /path/to/file".
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
Use structure for passing ACL line, in preparation for supporting policy
namespace and conditional parameters.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
Use common structure for ACL with "struct list_head" + "atomic_t".
Use array/struct where possible.
Remove is_group from "struct tomoyo_name_union"/"struct tomoyo_number_union".
Pass "struct file"->private_data rather than "struct file".
Update some of comments.
Bring tomoyo_same_acl_head() from common.h to domain.c .
Bring tomoyo_invalid()/tomoyo_valid() from common.h to util.c .
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
Update (or temporarily remove) comments.
Remove or replace some of #define lines.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
In order to synchronize with TOMOYO 1.8's syntax,
(1) Remove special handling for allow_read/write permission.
(2) Replace deny_rewrite/allow_rewrite permission with allow_append permission.
(3) Remove file_pattern keyword.
(4) Remove allow_read permission from exception policy.
(5) Allow creating domains in enforcing mode without calling supervisor.
(6) Add permission check for opening directory for reading.
(7) Add permission check for stat() operation.
(8) Make "cat < /sys/kernel/security/tomoyo/self_domain" behave as if
"cat /sys/kernel/security/tomoyo/self_domain".
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
The 'encrypted' key type defines its own payload format which contains a
symmetric key randomly generated that cannot be used directly to mount
an eCryptfs filesystem, because it expects an authentication token
structure.
This patch introduces the new format 'ecryptfs' that allows to store an
authentication token structure inside the encrypted key payload containing
a randomly generated symmetric key, as the same for the format 'default'.
More details about the usage of encrypted keys with the eCryptfs
filesystem can be found in the file 'Documentation/keys-ecryptfs.txt'.
Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Acked-by: Gianluca Ramunno <ramunno@polito.it>
Acked-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
This patch introduces a new parameter, called 'format', that defines the
format of data stored by encrypted keys. The 'default' format identifies
encrypted keys containing only the symmetric key, while other formats can
be defined to support additional information. The 'format' parameter is
written in the datablob produced by commands 'keyctl print' or
'keyctl pipe' and is integrity protected by the HMAC.
Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Acked-by: Gianluca Ramunno <ramunno@polito.it>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Some debug messages have been added in the function datablob_parse() in
order to better identify errors returned when dealing with 'encrypted'
keys.
Changelog from version v4:
- made the debug messages more understandable
Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Acked-by: Gianluca Ramunno <ramunno@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Do not dump the master key if an error is encountered during the request.
Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Acked-by: Gianluca Ramunno <ramunno@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
tracehook.h is on the way out. Rename tracehook_tracer_task() to
ptrace_parent() and move it from tracehook.h to ptrace.h.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Remove linux/mm.h inclusion from netdevice.h -- it's unused (I've checked manually).
To prevent mm.h inclusion via other channels also extract "enum dma_data_direction"
definition into separate header. This tiny piece is what gluing netdevice.h with mm.h
via "netdevice.h => dmaengine.h => dma-mapping.h => scatterlist.h => mm.h".
Removal of mm.h from scatterlist.h was tried and was found not feasible
on most archs, so the link was cutoff earlier.
Hope people are OK with tiny include file.
Note, that mm_types.h is still dragged in, but it is a separate story.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix error handling in construct_key_and_link().
If construct_alloc_key() returns an error, it shouldn't pass out through
the normal path as the key_serial() called by the kleave() statement
will oops when it gets an error code in the pointer:
BUG: unable to handle kernel paging request at ffffffffffffff84
IP: [<ffffffff8120b401>] request_key_and_link+0x4d7/0x52f
..
Call Trace:
[<ffffffff8120b52c>] request_key+0x41/0x75
[<ffffffffa00ed6e8>] cifs_get_spnego_key+0x206/0x226 [cifs]
[<ffffffffa00eb0c9>] CIFS_SessSetup+0x511/0x1234 [cifs]
[<ffffffffa00d9799>] cifs_setup_session+0x90/0x1ae [cifs]
[<ffffffffa00d9c02>] cifs_get_smb_ses+0x34b/0x40f [cifs]
[<ffffffffa00d9e05>] cifs_mount+0x13f/0x504 [cifs]
[<ffffffffa00caabb>] cifs_do_mount+0xc4/0x672 [cifs]
[<ffffffff8113ae8c>] mount_fs+0x69/0x155
[<ffffffff8114ff0e>] vfs_kern_mount+0x63/0xa0
[<ffffffff81150be2>] do_kern_mount+0x4d/0xdf
[<ffffffff81152278>] do_mount+0x63c/0x69f
[<ffffffff8115255c>] sys_mount+0x88/0xc2
[<ffffffff814fbdc2>] system_call_fastpath+0x16/0x1b
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
devcgroup_inode_permission: take "is it a device node" checks to inlined wrapper
fix comment in generic_permission()
kill obsolete comment for follow_down()
proc_sys_permission() is OK in RCU mode
reiserfs_permission() doesn't need to bail out in RCU mode
proc_fd_permission() is doesn't need to bail out in RCU mode
nilfs2_permission() doesn't need to bail out in RCU mode
logfs doesn't need ->permission() at all
coda_ioctl_permission() is safe in RCU mode
cifs_permission() doesn't need to bail out in RCU mode
bad_inode_permission() is safe from RCU mode
ubifs: dereferencing an ERR_PTR in ubifs_mount()
inode_permission() calls devcgroup_inode_permission() and almost all such
calls are _not_ for device nodes; let's at least keep the common path
straight...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
____call_usermodehelper() now erases any credentials set by the
subprocess_inf::init() function. The problem is that commit
17f60a7da1 ("capabilites: allow the application of capability limits
to usermode helpers") creates and commits new credentials with
prepare_kernel_cred() after the call to the init() function. This wipes
all keyrings after umh_keys_init() is called.
The best way to deal with this is to put the init() call just prior to
the commit_creds() call, and pass the cred pointer to init(). That
means that umh_keys_init() and suchlike can modify the credentials
_before_ they are published and potentially in use by the rest of the
system.
This prevents request_key() from working as it is prevented from passing
the session keyring it set up with the authorisation token to
/sbin/request-key, and so the latter can't assume the authority to
instantiate the key. This causes the in-kernel DNS resolver to fail
with ENOKEY unconditionally.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
Tested-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When policy version is less than POLICYDB_VERSION_FILENAME_TRANS,
skip file_name_trans_write().
Signed-off-by: Roy.Li <rongqing.li@windriver.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
In tomoyo_mount_acl() since 2.6.36, kern_path() was called without checking
dev_name != NULL. As a result, an unprivileged user can trigger oops by issuing
mount(NULL, "/", "ext3", 0, NULL) request.
Fix this by checking dev_name != NULL before calling kern_path(dev_name).
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: stable@kernel.org
Signed-off-by: James Morris <jmorris@namei.org>
Don't return EAGAIN to keyctl_assume_authority() to indicate that a key could
not be found (ENOKEY is only returned if a negative key is found). Instead
return ENOKEY in both cases.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Affected kernels 2.6.36 - 3.0
AppArmor may do a GFP_KERNEL memory allocation with task_lock(tsk->group_leader);
held when called from security_task_setrlimit. This will only occur when the
task's current policy has been replaced, and the task's creds have not been
updated before entering the LSM security_task_setrlimit() hook.
BUG: sleeping function called from invalid context at mm/slub.c:847
in_atomic(): 1, irqs_disabled(): 0, pid: 1583, name: cupsd
2 locks held by cupsd/1583:
#0: (tasklist_lock){.+.+.+}, at: [<ffffffff8104dafa>] do_prlimit+0x61/0x189
#1: (&(&p->alloc_lock)->rlock){+.+.+.}, at: [<ffffffff8104db2d>]
do_prlimit+0x94/0x189
Pid: 1583, comm: cupsd Not tainted 3.0.0-rc2-git1 #7
Call Trace:
[<ffffffff8102ebf2>] __might_sleep+0x10d/0x112
[<ffffffff810e6f46>] slab_pre_alloc_hook.isra.49+0x2d/0x33
[<ffffffff810e7bc4>] kmem_cache_alloc+0x22/0x132
[<ffffffff8105b6e6>] prepare_creds+0x35/0xe4
[<ffffffff811c0675>] aa_replace_current_profile+0x35/0xb2
[<ffffffff811c4d2d>] aa_current_profile+0x45/0x4c
[<ffffffff811c4d4d>] apparmor_task_setrlimit+0x19/0x3a
[<ffffffff811beaa5>] security_task_setrlimit+0x11/0x13
[<ffffffff8104db6b>] do_prlimit+0xd2/0x189
[<ffffffff8104dea9>] sys_setrlimit+0x3b/0x48
[<ffffffff814062bb>] system_call_fastpath+0x16/0x1b
Signed-off-by: John Johansen <john.johansen@canonical.com>
Reported-by: Miles Lane <miles.lane@gmail.com>
Cc: stable@kernel.org
Signed-off-by: James Morris <jmorris@namei.org>
This is a rather hot function that is called with a potentially NULL
"struct common_audit_data" pointer argument. And in that case it has to
provide and initialize its own dummy common_audit_data structure.
However, all the _common_ cases already pass it a real audit-data
structure, so that uncommon NULL case not only creates a silly run-time
test, more importantly it causes that function to have a big stack frame
for the dummy variable that isn't even used in the common case!
So get rid of that stupid run-time behavior, and make the (few)
functions that currently call with a NULL pointer just call a new helper
function instead (naturally called inode_has_perm_noapd(), since it has
no adp argument).
This makes the run-time test be a static code generation issue instead,
and allows for a much denser stack since none of the common callers need
the dummy structure. And a denser stack not only means less stack space
usage, it means better cache behavior. So we have a win-win-win from
this simplification: less code executed, smaller stack footprint, and
better cache behavior.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When invalid parameters are passed to apparmor_setprocattr a NULL deref
oops occurs when it tries to record an audit message. This is because
it is passing NULL for the profile parameter for aa_audit. But aa_audit
now requires that the profile passed is not NULL.
Fix this by passing the current profile on the task that is trying to
setprocattr.
Signed-off-by: Kees Cook <kees@ubuntu.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Cc: stable@kernel.org
Signed-off-by: James Morris <jmorris@namei.org>
* 'docs-move' of git://git.kernel.org/pub/scm/linux/kernel/git/rdunlap/linux-docs:
Create Documentation/security/, move LSM-, credentials-, and keys-related files from Documentation/ to Documentation/security/, add Documentation/security/00-INDEX, and update all occurrences of Documentation/<moved_file> to Documentation/security/<moved_file>.
Right now security_get_user_sids() will pass in a NULL avd pointer to
avc_has_perm_noaudit(), which then forces that function to have a dummy
entry for that case and just generally test it.
Don't do it. The normal callers all pass a real avd pointer, and this
helper function is incredibly hot. So don't make avc_has_perm_noaudit()
do conditional stuff that isn't needed for the common case.
This also avoids some duplicated stack space.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add cgroup subsystem callbacks for per-thread attachment in atomic contexts
Add can_attach_task(), pre_attach(), and attach_task() as new callbacks
for cgroups's subsystem interface. Unlike can_attach and attach, these
are for per-thread operations, to be called potentially many times when
attaching an entire threadgroup.
Also, the old "bool threadgroup" interface is removed, as replaced by
this. All subsystems are modified for the new interface - of note is
cpuset, which requires from/to nodemasks for attach to be globally scoped
(though per-cpuset would work too) to persist from its pre_attach to
attach_task and attach.
This is a pre-patch for cgroup-procs-writable.patch.
Signed-off-by: Ben Blum <bblum@andrew.cmu.edu>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Matt Helsley <matthltc@us.ibm.com>
Reviewed-by: Paul Menage <menage@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I submit the patch again, according to patch submission convension.
This patch enables to accept percent-encoded object names as forth
argument of /selinux/create interface to avoid possible bugs when we
give an object name including whitespace or multibutes.
E.g) if and when a userspace object manager tries to create a new object
named as "resolve.conf but fake", it shall give this name as the forth
argument of the /selinux/create. But sscanf() logic in kernel space
fetches only the part earlier than the first whitespace.
In this case, selinux may unexpectedly answer a default security context
configured to "resolve.conf", but it is bug.
Although I could not test this patch on named TYPE_TRANSITION rules
actually, But debug printk() message seems to me the logic works
correctly.
I assume the libselinux provides an interface to apply this logic
transparently, so nothing shall not be changed from the viewpoint of
application.
Signed-off-by: KaiGai Kohei <kohei.kaigai@emea.nec.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Since this cred was not created with copy_creds(), it needs to get
initialized. Otherwise use of syscall(__NR_keyctl, KEYCTL_SESSION_TO_PARENT);
can lead to a NULL deref. Thanks to Robert for finding this.
But introduced by commit 47a150edc2 ("Cache user_ns in struct cred").
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Reported-by: Robert Święcki <robert@swiecki.net>
Cc: David Howells <dhowells@redhat.com>
Cc: stable@kernel.org (2.6.39)
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
b43: fix comment typo reqest -> request
Haavard Skinnemoen has left Atmel
cris: typo in mach-fs Makefile
Kconfig: fix copy/paste-ism for dell-wmi-aio driver
doc: timers-howto: fix a typo ("unsgined")
perf: Only include annotate.h once in tools/perf/util/ui/browsers/annotate.c
md, raid5: Fix spelling error in comment ('Ofcourse' --> 'Of course').
treewide: fix a few typos in comments
regulator: change debug statement be consistent with the style of the rest
Revert "arm: mach-u300/gpio: Fix mem_region resource size miscalculations"
audit: acquire creds selectively to reduce atomic op overhead
rtlwifi: don't touch with treewide double semicolon removal
treewide: cleanup continuations and remove logging message whitespace
ath9k_hw: don't touch with treewide double semicolon removal
include/linux/leds-regulator.h: fix syntax in example code
tty: fix typo in descripton of tty_termios_encode_baud_rate
xtensa: remove obsolete BKL kernel option from defconfig
m68k: fix comment typo 'occcured'
arch:Kconfig.locks Remove unused config option.
treewide: remove extra semicolons
...
There is no point in counting hits - we can calculate it from the number
of lookups and misses.
This makes the avc statistics a bit smaller, and makes the code
generation better too.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
You can turn off the avc cache stats, but distributions seem to not do
that (perhaps because several performance tuning how-to's talk about the
avc cache statistics).
Which is sad, because the code it generates is truly horrendous, with
the statistics update being sandwitched between get_cpu/put_cpu which in
turn causes preemption disables etc. We're talking ten+ instructions
just to increment a per-cpu variable in some pretty hot code.
Fix the craziness by just using 'this_cpu_inc()' instead. Suddenly we
only need a single 'inc' instruction to increment the statistics. This
is quite noticeable in the incredibly hot avc_has_perm_noaudit()
function (which triggers all the statistics by virtue of doing an
avc_lookup() call).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
move LSM-, credentials-, and keys-related files from Documentation/
to Documentation/security/,
add Documentation/security/00-INDEX, and
update all occurrences of Documentation/<moved_file>
to Documentation/security/<moved_file>.
The filename_trans rule processing has some printk(KERN_ERR ) messages
which were intended as debug aids in creating the code but weren't removed
before it was submitted. Remove them.
Reported-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Eric Paris <eparis@redhat.com>
In tomoyo_correct_domain() since 2.6.36, TOMOYO was by error validating
"<kernel>" + "/foo/\" + "/bar" when "<kernel> /foo/\* /bar" was given.
As a result, legal domainnames like "<kernel> /foo/\* /bar" are rejected.
Reported-by: Hayama Yossihiro <yossi@yedo.src.co.jp>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
In the interest of keeping userspace from having to create new root
filesystems all the time, let's follow the lead of the other in-kernel
filesystems and provide a proper mount point for it in sysfs.
For selinuxfs, this mount point should be in /sys/fs/selinux/
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Lennart Poettering <mzerqung@0pointer.de>
Cc: Daniel J Walsh <dwalsh@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
[include kobject.h - Eric Paris]
[use selinuxfs_obj throughout - Eric Paris]
Signed-off-by: Eric Paris <eparis@redhat.com>
The rcu callback sel_netif_free() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(sel_netif_free).
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
The rcu callback user_update_rcu_disposal() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(user_update_rcu_disposal).
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: David Howells <dhowells@redhat.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Change flex_array_prealloc to take the number of elements for which space
should be allocated instead of the last (inclusive) element. Users
and documentation are updated accordingly. flex_arrays got introduced before
they had users. When folks started using it, they ended up needing a
different API than was coded up originally. This swaps over to the API that
folks apparently need.
Based-on-patch-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Tested-by: Chris Richards <gizmo@giz-works.com>
Acked-by: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: stable@kernel.org [2.6.38+]
New inodes are created in a two stage process. We first will compute the
label on a new inode in security_inode_create() and check if the
operation is allowed. We will then actually re-compute that same label and
apply it in security_inode_init_security(). The change to do new label
calculations based in part on the last component of the path name only
passed the path component information all the way down the
security_inode_init_security hook. Down the security_inode_create hook the
path information did not make it past may_create. Thus the two calculations
came up differently and the permissions check might not actually be against
the label that is created. Pass and use the same information in both places
to harmonize the calculations and checks.
Reported-by: Dominick Grift <domg472@gmail.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
We currently have inode_has_perm and dentry_has_perm. dentry_has_perm just
calls inode_has_perm with additional audit data. But dentry_has_perm can
take either a dentry or a path. Split those to make the code obvious and
to fix the previous problem where I thought dentry_has_perm always had a
valid dentry and mnt.
Signed-off-by: Eric Paris <eparis@redhat.com>
Change flex_array_prealloc to take the number of elements for which space
should be allocated instead of the last (inclusive) element. Users
and documentation are updated accordingly. flex_arrays got introduced before
they had users. When folks started using it, they ended up needing a
different API than was coded up originally. This swaps over to the API that
folks apparently need.
Based-on-patch-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Tested-by: Chris Richards <gizmo@giz-works.com>
Acked-by: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: stable@kernel.org [2.6.38+]
New inodes are created in a two stage process. We first will compute the
label on a new inode in security_inode_create() and check if the
operation is allowed. We will then actually re-compute that same label and
apply it in security_inode_init_security(). The change to do new label
calculations based in part on the last component of the path name only
passed the path component information all the way down the
security_inode_init_security hook. Down the security_inode_create hook the
path information did not make it past may_create. Thus the two calculations
came up differently and the permissions check might not actually be against
the label that is created. Pass and use the same information in both places
to harmonize the calculations and checks.
Reported-by: Dominick Grift <domg472@gmail.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
To shorten the list we need to run if filename trans rules exist for the type
of the given parent directory I put them in a hashtable. Given the policy we
are expecting to use in Fedora this takes the worst case list run from about
5,000 entries to 17.
Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Instead of a hashtab entry counter function only useful for range
transition rules make a function generic for any hashtable to use.
Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
We have custom debug functions like rangetr_hash_eval and symtab_hash_eval
which do the same thing. Just create a generic function that takes the name
of the hash table as an argument instead of having custom functions.
Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Right now we walk to filename trans rule list for every inode that is
created. First passes at policy using this facility creates around 5000
filename trans rules. Running a list of 5000 entries every time is a bad
idea. This patch adds a new ebitmap to policy which has a bit set for each
ttype that has at least 1 filename trans rule. Thus when an inode is
created we can quickly determine if any rules exist for this parent
directory type and can skip the list if we know there is definitely no
relevant entry.
Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
filename_compute_type() takes as arguments the numeric value of the type of
the subject and target. It does not take a context. Thus the names are
misleading. Fix the argument names.
Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
filename_compute_type used to take a qstr, but it now takes just a name.
Fix the comments to indicate it is an objname, not a qstr.
Signed-off-by: Eric Paris <eparis@redhat.com>
Now that the security modules can decide whether they support the
dcache RCU walk or not it's possible to make selinux a bit more
RCU friendly. The SELinux AVC and security server access decision
code is RCU safe. A specific piece of the LSM audit code may not
be RCU safe.
This patch makes the VFS RCU walk retry if it would hit the non RCU
safe chunk of code. It will normally just work under RCU. This is
done simply by passing the VFS RCU state as a flag down into the
avc_audit() code and returning ECHILD there if it would have an issue.
Based-on-patch-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
smack_file_lock has a struct path, so use that instead of only the
dentry.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
This patch separates and audit message that only contains a dentry from
one that contains a full path. This allows us to make it harder to
misuse the interfaces or for the interfaces to be implemented wrong.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
The lsm common audit code has wacky contortions making sure which pieces
of information are set based on if it was given a path, dentry, or
inode. Split this into path and inode to get rid of some of the code
complexity.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Now that the security modules can decide whether they support the
dcache RCU walk or not it's possible to make selinux a bit more
RCU friendly. The SELinux AVC and security server access decision
code is RCU safe. A specific piece of the LSM audit code may not
be RCU safe.
This patch makes the VFS RCU walk retry if it would hit the non RCU
safe chunk of code. It will normally just work under RCU. This is
done simply by passing the VFS RCU state as a flag down into the
avc_audit() code and returning ECHILD there if it would have an issue.
Based-on-patch-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Right now all RCU walks fall back to reference walk when CONFIG_SECURITY
is enabled, even though just the standard capability module is active.
This is because security_inode_exec_permission unconditionally fails
RCU walks.
Move this decision to the low level security module. This requires
passing the RCU flags down the security hook. This way at least
the capability module and a few easy cases in selinux/smack work
with RCU walks with CONFIG_SECURITY=y
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
The len should be an size_t but is a ssize_t. Easy enough fix to silence
build warnings. We have no need for signed-ness.
Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
If one builds a kernel without CONFIG_BUG there are a number of 'may be
used uninitialized' warnings. Silence these by returning after the BUG().
Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Right now all RCU walks fall back to reference walk when CONFIG_SECURITY
is enabled, even though just the standard capability module is active.
This is because security_inode_exec_permission unconditionally fails
RCU walks.
Move this decision to the low level security module. This requires
passing the RCU flags down the security hook. This way at least
the capability module and a few easy cases in selinux/smack work
with RCU walks with CONFIG_SECURITY=y
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The filename_trans rule processing has some printk(KERN_ERR ) messages
which were intended as debug aids in creating the code but weren't removed
before it was submitted. Remove them.
Signed-off-by: Eric Paris <eparis@redhat.com>
In tomoyo_mount_acl() since 2.6.36, reference to device file (e.g. /dev/sda1)
was leaking.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
In tomoyo_flush(), head->r.w[0] holds pointer to string data to be printed.
But head->r.w[0] was updated only when the string data was partially
printed (because head->r.w[0] will be updated by head->r.w[1] later if
completely printed). However, regarding /sys/kernel/security/tomoyo/query ,
an additional '\0' is printed after the string data was completely printed.
But if free space for read buffer became 0 before printing the additional '\0',
tomoyo_flush() was returning without updating head->r.w[0]. As a result,
tomoyo_flush() forever reprints already printed string data.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
"mount --bind /path/to/file1 /path/to/file2" is legal. Therefore,
"umount /path/to/file2" is also legal. Do not automatically append trailing '/'
if pathname to be unmounted does not end with '/'.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
In tomoyo_write_profile() since 2.6.34, a lock was by error missing when
replacing profile's comment line. If multiple threads attempted
echo '0-COMMENT=comment' > /sys/kernel/security/tomoyo/profile
in parallel, garbage collector will fail to kfree() the old value.
Protect the replacement using a lock. Also, keep the old value rather than
replace with empty string when out of memory error has occurred.
Signed-off-by: Xiaochen Wang <wangxiaochen0@gmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
Initialize policydb.process_class once all symtabs read from policy image,
so that it could be used to setup the role_trans.tclass field when a lower
version policy.X is loaded.
Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Commit 6f5317e730 introduced a bug in the
handling of userspace object classes that is causing breakage for Xorg
when XSELinux is enabled. Fix the bug by changing map_class() to return
SECCLASS_NULL when the class cannot be mapped to a kernel object class.
Reported-by: "Justin P. Mattock" <justinmattock@gmail.com>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
When the global init task is exec'd we have special case logic to make sure
the pE is not reduced. There is no reason for this. If init wants to drop
it's pE is should be allowed to do so. Remove this special logic.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Andrew G. Morgan <morgan@kernel.org>
Signed-off-by: James Morris <jmorris@namei.org>
The attached patch allows /selinux/create takes optional 4th argument
to support TYPE_TRANSITION with name extension for userspace object
managers.
If 4th argument is not supplied, it shall perform as existing kernel.
In fact, the regression test of SE-PostgreSQL works well on the patched
kernel.
Thanks,
Signed-off-by: KaiGai Kohei <kohei.kaigai@eu.nec.com>
[manually verify fuzz was not an issue, and it wasn't: eparis]
Signed-off-by: Eric Paris <eparis@redhat.com>
When memory used for policy exceeds the quota, tomoyo_memory_ok() return false.
In this case, tomoyo_commit_ok() must call kfree() before returning NULL.
This bug exists since 2.6.35.
Signed-off-by: Xiaochen Wang <wangxiaochen0@gmail.com>
Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
Commit 6f5317e730 introduced a bug in the
handling of userspace object classes that is causing breakage for Xorg
when XSELinux is enabled. Fix the bug by changing map_class() to return
SECCLASS_NULL when the class cannot be mapped to a kernel object class.
Reported-by: "Justin P. Mattock" <justinmattock@gmail.com>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
If kernel policy version is >= 26, then write the class field of the
role_trans structure into the binary reprensentation.
Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Eric Paris <eparis@redhat.com>
Apply role_transition rules for all kinds of classes.
Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Eric Paris <eparis@redhat.com>
If kernel policy version is >= 26, then the binary representation of
the role_trans structure supports specifying the class for the current
subject or the newly created object.
If kernel policy version is < 26, then the class field would be default
to the process class.
Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Eric Paris <eparis@redhat.com>
And give it a kernel-doc comment.
[akpm@linux-foundation.org: btrfs changed in linux-next]
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ptrace is allowed to tasks in the same user namespace according to the
usual rules (i.e. the same rules as for two tasks in the init user
namespace). ptrace is also allowed to a user namespace to which the
current task the has CAP_SYS_PTRACE capability.
Changelog:
Dec 31: Address feedback by Eric:
. Correct ptrace uid check
. Rename may_ptrace_ns to ptrace_capable
. Also fix the cap_ptrace checks.
Jan 1: Use const cred struct
Jan 11: use task_ns_capable() in place of ptrace_capable().
Feb 23: same_or_ancestore_user_ns() was not an appropriate
check to constrain cap_issubset. Rather, cap_issubset()
only is meaningful when both capsets are in the same
user_ns.
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Introduce ns_capable to test for a capability in a non-default
user namespace.
- Teach cap_capable to handle capabilities in a non-default
user namespace.
The motivation is to get to the unprivileged creation of new
namespaces. It looks like this gets us 90% of the way there, with
only potential uid confusion issues left.
I still need to handle getting all caps after creation but otherwise I
think I have a good starter patch that achieves all of your goals.
Changelog:
11/05/2010: [serge] add apparmor
12/14/2010: [serge] fix capabilities to created user namespaces
Without this, if user serge creates a user_ns, he won't have
capabilities to the user_ns he created. THis is because we
were first checking whether his effective caps had the caps
he needed and returning -EPERM if not, and THEN checking whether
he was the creator. Reverse those checks.
12/16/2010: [serge] security_real_capable needs ns argument in !security case
01/11/2011: [serge] add task_ns_capable helper
01/11/2011: [serge] add nsown_capable() helper per Bastian Blank suggestion
02/16/2011: [serge] fix a logic bug: the root user is always creator of
init_user_ns, but should not always have capabilities to
it! Fix the check in cap_capable().
02/21/2011: Add the required user_ns parameter to security_capable,
fixing a compile failure.
02/23/2011: Convert some macros to functions as per akpm comments. Some
couldn't be converted because we can't easily forward-declare
them (they are inline if !SECURITY, extern if SECURITY). Add
a current_user_ns function so we can use it in capability.h
without #including cred.h. Move all forward declarations
together to the top of the #ifdef __KERNEL__ section, and use
kernel-doc format.
02/23/2011: Per dhowells, clean up comment in cap_capable().
02/23/2011: Per akpm, remove unreachable 'return -EPERM' in cap_capable.
(Original written and signed off by Eric; latest, modified version
acked by him)
[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: export current_user_ns() for ecryptfs]
[serge.hallyn@canonical.com: remove unneeded extra argument in selinux's task_has_capability]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The default for this is universally set to 64k, but the help says:
For most ia64, ppc64 and x86 users with lots of address space
a value of 65536 is reasonable and should cause no problems.
On arm and other archs it should not be higher than 32768.
The text is right, in that we are seeing selinux-enabled ARM targets
that fail to launch /sbin/init because selinux blocks a memory map.
So select the right value if we know we are building ARM.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: James Morris <jmorris@namei.org>
Make request_key() and co. return an error for a negative or rejected key. If
the key was simply negated, then return ENOKEY, otherwise return the error
with which it was rejected.
Without this patch, the following command returns a key number (with the latest
keyutils):
[root@andromeda ~]# keyctl request2 user debug:foo rejected @s
586569904
Trying to print the key merely gets you a permission denied error:
[root@andromeda ~]# keyctl print 586569904
keyctl_read_alloc: Permission denied
Doing another request_key() call does get you the error, as long as it hasn't
expired yet:
[root@andromeda ~]# keyctl request user debug:foo
request_key: Key was rejected by service
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Improve /proc/keys by:
(1) Don't attempt to summarise the payload of a negated key. It won't have
one. To this end, a helper function - key_is_instantiated() has been
added that allows the caller to find out whether the key is positively
instantiated (as opposed to being uninstantiated or negatively
instantiated).
(2) Do show keys that are negative, expired or revoked rather than hiding
them. This requires an override flag (no_state_check) to be passed to
search_my_process_keyrings() and keyring_search_aux() to suppress this
check.
Without this, keys that are possessed by the caller, but only grant
permissions to the caller if possessed are skipped as the possession check
fails.
Keys that are visible due to user, group or other checks are visible with
or without this patch.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1480 commits)
bonding: enable netpoll without checking link status
xfrm: Refcount destination entry on xfrm_lookup
net: introduce rx_handler results and logic around that
bonding: get rid of IFF_SLAVE_INACTIVE netdev->priv_flag
bonding: wrap slave state work
net: get rid of multiple bond-related netdevice->priv_flags
bonding: register slave pointer for rx_handler
be2net: Bump up the version number
be2net: Copyright notice change. Update to Emulex instead of ServerEngines
e1000e: fix kconfig for crc32 dependency
netfilter ebtables: fix xt_AUDIT to work with ebtables
xen network backend driver
bonding: Improve syslog message at device creation time
bonding: Call netif_carrier_off after register_netdevice
bonding: Incorrect TX queue offset
net_sched: fix ip_tos2prio
xfrm: fix __xfrm_route_forward()
be2net: Fix UDP packet detected status in RX compl
Phonet: fix aligned-mode pipe socket buffer header reserve
netxen: support for GbE port settings
...
Fix up conflicts in drivers/staging/brcm80211/brcmsmac/wl_mac80211.c
with the staging updates.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (33 commits)
AppArmor: kill unused macros in lsm.c
AppArmor: cleanup generated files correctly
KEYS: Add an iovec version of KEYCTL_INSTANTIATE
KEYS: Add a new keyctl op to reject a key with a specified error code
KEYS: Add a key type op to permit the key description to be vetted
KEYS: Add an RCU payload dereference macro
AppArmor: Cleanup make file to remove cruft and make it easier to read
SELinux: implement the new sb_remount LSM hook
LSM: Pass -o remount options to the LSM
SELinux: Compute SID for the newly created socket
SELinux: Socket retains creator role and MLS attribute
SELinux: Auto-generate security_is_socket_class
TOMOYO: Fix memory leak upon file open.
Revert "selinux: simplify ioctl checking"
selinux: drop unused packet flow permissions
selinux: Fix packet forwarding checks on postrouting
selinux: Fix wrong checks for selinux_policycap_netpeer
selinux: Fix check for xfrm selinux context algorithm
ima: remove unnecessary call to ima_must_measure
IMA: remove IMA imbalance checking
...
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (62 commits)
posix-clocks: Check write permissions in posix syscalls
hrtimer: Remove empty hrtimer_init_hres_timer()
hrtimer: Update hrtimer->state documentation
hrtimer: Update base[CLOCK_BOOTTIME].offset correctly
timers: Export CLOCK_BOOTTIME via the posix timers interface
timers: Add CLOCK_BOOTTIME hrtimer base
time: Extend get_xtime_and_monotonic_offset() to also return sleep
time: Introduce get_monotonic_boottime and ktime_get_boottime
hrtimers: extend hrtimer base code to handle more then 2 clockids
ntp: Remove redundant and incorrect parameter check
mn10300: Switch do_timer() to xtimer_update()
posix clocks: Introduce dynamic clocks
posix-timers: Cleanup namespace
posix-timers: Add support for fd based clocks
x86: Add clock_adjtime for x86
posix-timers: Introduce a syscall for clock tuning.
time: Splitout compat timex accessors
ntp: Add ADJ_SETOFFSET mode bit
time: Introduce timekeeping_inject_offset
posix-timer: Update comment
...
Fix up new system-call-related conflicts in
arch/x86/ia32/ia32entry.S
arch/x86/include/asm/unistd_32.h
arch/x86/include/asm/unistd_64.h
arch/x86/kernel/syscall_table_32.S
(name_to_handle_at()/open_by_handle_at() vs clock_adjtime()), and some
due to movement of get_jiffies_64() in:
kernel/time.c
I intend to turn struct flowi into a union of AF specific flowi
structs. There will be a common structure that each variant includes
first, much like struct sock_common.
This is the first step to move in that direction.
Signed-off-by: David S. Miller <davem@davemloft.net>
clean-files should be defined as a variable not a target.
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Add a keyctl op (KEYCTL_INSTANTIATE_IOV) that is like KEYCTL_INSTANTIATE, but
takes an iovec array and concatenates the data in-kernel into one buffer.
Since the KEYCTL_INSTANTIATE copies the data anyway, this isn't too much of a
problem.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Add a new keyctl op to reject a key with a specified error code. This works
much the same as negating a key, and so keyctl_negate_key() is made a special
case of keyctl_reject_key(). The difference is that keyctl_negate_key()
selects ENOKEY as the error to be reported.
Typically the key would be rejected with EKEYEXPIRED, EKEYREVOKED or
EKEYREJECTED, but this is not mandatory.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Add a key type operation to permit the key type to vet the description of a new
key that key_alloc() is about to allocate. The operation may reject the
description if it wishes with an error of its choosing. If it does this, the
key will not be allocated.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
Add an RCU payload dereference macro as this seems to be a common piece of code
amongst key types that use RCU referenced payloads.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
Cleanups based on comments from Sam Ravnborg,
* remove references to the currently unused af_names.h
* add rlim_names.h to clean-files:
* rework cmd_make-XXX to make them more readable by adding comments,
reworking the expressions to put logical components on individual lines,
and keep lines < 80 characters.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Netlink message processing in the kernel is synchronous these days,
capabilities can be checked directly in security_netlink_recv() from
the current process.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Reviewed-by: James Morris <jmorris@namei.org>
[chrisw: update to include pohmelfs and uvesafb]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
For SELinux we do not allow security information to change during a remount
operation. Thus this hook simply strips the security module options from
the data and verifies that those are the same options as exist on the
current superblock.
Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
The VFS mount code passes the mount options to the LSM. The LSM will remove
options it understands from the data and the VFS will then pass the remaining
options onto the underlying filesystem. This is how options like the
SELinux context= work. The problem comes in that -o remount never calls
into LSM code. So if you include an LSM specific option it will get passed
to the filesystem and will cause the remount to fail. An example of where
this is a problem is the 'seclabel' option. The SELinux LSM hook will
print this word in /proc/mounts if the filesystem is being labeled using
xattrs. If you pass this word on mount it will be silently stripped and
ignored. But if you pass this word on remount the LSM never gets called
and it will be passed to the FS. The FS doesn't know what seclabel means
and thus should fail the mount. For example an ext3 fs mounted over loop
# mount -o loop /tmp/fs /mnt/tmp
# cat /proc/mounts | grep /mnt/tmp
/dev/loop0 /mnt/tmp ext3 rw,seclabel,relatime,errors=continue,barrier=0,data=ordered 0 0
# mount -o remount /mnt/tmp
mount: /mnt/tmp not mounted already, or bad option
# dmesg
EXT3-fs (loop0): error: unrecognized mount option "seclabel" or missing value
This patch passes the remount mount options to an new LSM hook.
Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
The security context for the newly created socket shares the same
user, role and MLS attribute as its creator but may have a different
type, which could be specified by a type_transition rule in the relevant
policy package.
Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
[fix call to security_transition_sid to include qstr, Eric Paris]
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
The socket SID would be computed on creation and no longer inherit
its creator's SID by default. Socket may have a different type but
needs to retain the creator's role and MLS attribute in order not
to break labeled networking and network access control.
The kernel value for a class would be used to determine if the class
if one of socket classes. If security_compute_sid is called from
userspace the policy value for a class would be mapped to the relevant
kernel value first.
Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
The security_is_socket_class() is auto-generated by genheaders based
on classmap.h to reduce maintenance effort when a new class is defined
in SELinux kernel. The name for any socket class should be suffixed by
"socket" and doesn't contain more than one substr of "socket".
Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Netlink message processing in the kernel is synchronous these days, the
session information can be collected when needed.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
In tomoyo_check_open_permission() since 2.6.36, TOMOYO was by error
recalculating already calculated pathname when checking allow_rewrite
permission. As a result, memory will leak whenever a file is opened for writing
without O_APPEND flag. Also, performance will degrade because TOMOYO is
calculating pathname regardless of profile configuration.
This patch fixes the leak and performance degrade.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
This reverts commit 242631c49d.
Conflicts:
security/selinux/hooks.c
SELinux used to recognize certain individual ioctls and check
permissions based on the knowledge of the individual ioctl. In commit
242631c49d the SELinux code stopped trying to understand
individual ioctls and to instead looked at the ioctl access bits to
determine in we should check read or write for that operation. This
same suggestion was made to SMACK (and I believe copied into TOMOYO).
But this suggestion is total rubbish. The ioctl access bits are
actually the access requirements for the structure being passed into the
ioctl, and are completely unrelated to the operation of the ioctl or the
object the ioctl is being performed upon.
Take FS_IOC_FIEMAP as an example. FS_IOC_FIEMAP is defined as:
FS_IOC_FIEMAP _IOWR('f', 11, struct fiemap)
So it has access bits R and W. What this really means is that the
kernel is going to both read and write to the struct fiemap. It has
nothing at all to do with the operations that this ioctl might perform
on the file itself!
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
These permissions are not used and can be dropped in the kernel
definitions.
Suggested-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
The IPSKB_FORWARDED and IP6SKB_FORWARDED flags are used only in the
multicast forwarding case to indicate that a packet looped back after
forward. So these flags are not a good indicator for packet forwarding.
A better indicator is the incoming interface. If we have no socket context,
but an incoming interface and we see the packet in the ip postroute hook,
the packet is going to be forwarded.
With this patch we use the incoming interface as an indicator on packet
forwarding.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
selinux_sock_rcv_skb_compat and selinux_ip_postroute_compat are just
called if selinux_policycap_netpeer is not set. However in these
functions we check if selinux_policycap_netpeer is set. This leads
to some dead code and to the fact that selinux_xfrm_postroute_last
is never executed. This patch removes the dead code and the checks
for selinux_policycap_netpeer in the compatibility functions.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
selinux_xfrm_sec_ctx_alloc accidentally checks the xfrm domain of
interpretation against the selinux context algorithm. This patch
fixes this by checking ctx_alg against the selinux context algorithm.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
The original ima_must_measure() function based its results on cached
iint information, which required an iint be allocated for all files.
Currently, an iint is allocated only for files in policy. As a result,
for those files in policy, ima_must_measure() is now called twice: once
to determine if the inode is in the measurement policy and, the second
time, to determine if it needs to be measured/re-measured.
The second call to ima_must_measure() unnecessarily checks to see if
the file is in policy. As we already know the file is in policy, this
patch removes the second unnecessary call to ima_must_measure(), removes
the vestige iint parameter, and just checks the iint directly to determine
if the inode has been measured or needs to be measured/re-measured.
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Expand security_capable() to include cred, so that it can be usable in a
wider range of call sites.
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: James Morris <jmorris@namei.org>
Now that i_readcount is maintained by the VFS layer, remove the
imbalance checking in IMA. Cleans up the IMA code nicely.
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
ima_counts_get() updated the readcount and invalidated the PCR,
as necessary. Only update the i_readcount in the VFS layer.
Move the PCR invalidation checks to ima_file_check(), where it
belongs.
Maintaining the i_readcount in the VFS layer, will allow other
subsystems to use i_readcount.
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
The mmap policy enforcement checks the access of the
SMACK64MMAP subject against the current subject incorrectly.
The check as written works correctly only if the access
rules involved have the same access. This is the common
case, so initial testing did not find a problem.
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
The mmap policy enforcement was not properly handling the
interaction between the global and local rule lists.
Instead of going through one and then the other, which
missed the important case where a rule specified that
there should be no access, combine the access limitations
where there is a rule in each list.
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <jmorris@namei.org>
In cred_alloc_blank() since 2.6.32, abort_creds(new) is called with
new->security == NULL and new->magic == 0 when security_cred_alloc_blank()
returns an error. As a result, BUG() will be triggered if SELinux is enabled
or CONFIG_DEBUG_CREDENTIALS=y.
If CONFIG_DEBUG_CREDENTIALS=y, BUG() is called from __invalid_creds() because
cred->magic == 0. Failing that, BUG() is called from selinux_cred_free()
because selinux_cred_free() is not expecting cred->security == NULL. This does
not affect smack_cred_free(), tomoyo_cred_free() or apparmor_cred_free().
Fix these bugs by
(1) Set new->magic before calling security_cred_alloc_blank().
(2) Handle null cred->security in creds_are_invalid() and selinux_cred_free().
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Both settimeofday() and clock_settime() promise with a 'const'
attribute not to alter the arguments passed in. This patch adds the
missing 'const' attribute into the various kernel functions
implementing these calls.
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Acked-by: John Stultz <johnstul@us.ibm.com>
LKML-Reference: <20110201134417.545698637@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The only user for this hook was selinux. sysctl routes every call
through /proc/sys/. Selinux and other security modules use the file
system checks for sysctl too, so no need for this hook any more.
Signed-off-by: Lucian Adrian Grijincu <lucian.grijincu@gmail.com>
Signed-off-by: Eric Paris <eparis@redhat.com>