author | Linus Torvalds <torvalds@ppc970.osdl.org> | 2005-04-16 15:20:36 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@ppc970.osdl.org> | 2005-04-16 15:20:36 -0700 |
commit | 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 (patch) | |
tree | 0bba044c4ce775e45a88a51686b5d9f90697ea9d /init | |
download | linux-1da177e4c3f41524e886b7f1b8a0c1fc7321cac2.tar.bz2 |
Linux-2.6.12-rc2 (tag: v2.6.12-rc2)
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!
Diffstat (limited to 'init')
-rw-r--r-- | init/Kconfig | 463 |
-rw-r--r-- | init/Makefile | 28 |
-rw-r--r-- | init/calibrate.c | 79 |
-rw-r--r-- | init/do_mounts.c | 430 |
-rw-r--r-- | init/do_mounts.h | 92 |
-rw-r--r-- | init/do_mounts_devfs.c | 137 |
-rw-r--r-- | init/do_mounts_initrd.c | 121 |
-rw-r--r-- | init/do_mounts_md.c | 290 |
-rw-r--r-- | init/do_mounts_rd.c | 429 |
-rw-r--r-- | init/initramfs.c | 500 |
-rw-r--r-- | init/main.c | 713 |
-rw-r--r-- | init/version.c | 33 |
12 files changed, 3315 insertions, 0 deletions
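Reader's note on init/calibrate.c: calibrate_delay() in this commit finds loops_per_jiffy in two phases. It first doubles the value until one __delay() call spans a clock tick, then refines the next LPS_PREC (8) lower bits by binary approximation. It prints the result as BogoMIPS using integer arithmetic only. The following is a minimal userspace sketch of that logic, not the kernel code itself. The names takes_longer_than_tick(), calibrate_sim(), bogomips_whole() and bogomips_frac() are hypothetical, and the timer (jiffies plus __delay()) is replaced by a comparison against an assumed "true" loops-per-tick value.

```c
#include <assert.h>

#define LPS_PREC 8  /* bits of precision, as in init/calibrate.c */

/*
 * Hypothetical stand-in for "run __delay(loops) and see whether jiffies
 * advanced": here a delay is assumed to cross a tick exactly when it
 * asks for more loops than the machine can run per tick.
 */
int takes_longer_than_tick(unsigned long loops, unsigned long true_lpj)
{
    return loops > true_lpj;
}

/* Simulated version of the two-phase search in calibrate_delay(). */
unsigned long calibrate_sim(unsigned long true_lpj)
{
    unsigned long lpj = 1UL << 12;
    unsigned long loopbit;
    int prec = LPS_PREC;

    /* Coarse phase: double until one delay spans at least a tick. */
    while ((lpj <<= 1) != 0) {
        if (takes_longer_than_tick(lpj, true_lpj))
            break;
    }

    /* Fine phase: step back one power of two, then try each lower bit. */
    lpj >>= 1;
    loopbit = lpj;
    while (prec-- && (loopbit >>= 1)) {
        lpj |= loopbit;
        if (takes_longer_than_tick(lpj, true_lpj))
            lpj &= ~loopbit;    /* too long: drop this bit again */
    }
    return lpj;
}

/* Integer part of the "%lu.%02lu BogoMIPS" value passed to printk(). */
unsigned long bogomips_whole(unsigned long lpj, unsigned long hz)
{
    assert(hz > 0 && hz <= 5000);   /* keeps the divisors non-zero */
    return lpj / (500000 / hz);
}

/* Two-digit fractional part of the same value. */
unsigned long bogomips_frac(unsigned long lpj, unsigned long hz)
{
    assert(hz > 0 && hz <= 5000);
    return (lpj / (5000 / hz)) % 100;
}
```

Under these assumptions, a simulated machine with 2000000 loops per tick calibrates to 1998848, which at HZ=1000 would be reported as 3997.69 BogoMIPS. The result never exceeds the true value, and the error is bounded by the lowest bit tested, which is the "little better than 1%" the source comment mentions.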
diff --git a/init/Kconfig b/init/Kconfig new file mode 100644 index 000000000000..abe2682a6ca6 --- /dev/null +++ b/init/Kconfig @@ -0,0 +1,463 @@ +menu "Code maturity level options" + +config EXPERIMENTAL + bool "Prompt for development and/or incomplete code/drivers" + ---help--- + Some of the various things that Linux supports (such as network + drivers, file systems, network protocols, etc.) can be in a state + of development where the functionality, stability, or the level of + testing is not yet high enough for general use. This is usually + known as the "alpha-test" phase among developers. If a feature is + currently in alpha-test, then the developers usually discourage + uninformed widespread use of this feature by the general public to + avoid "Why doesn't this work?" type mail messages. However, active + testing and use of these systems is welcomed. Just be aware that it + may not meet the normal level of reliability or it may fail to work + in some special cases. Detailed bug reports from people familiar + with the kernel internals are usually welcomed by the developers + (before submitting bug reports, please read the documents + <file:README>, <file:MAINTAINERS>, <file:REPORTING-BUGS>, + <file:Documentation/BUG-HUNTING>, and + <file:Documentation/oops-tracing.txt> in the kernel source). + + This option will also make obsoleted drivers available. These are + drivers that have been replaced by something else, and/or are + scheduled to be removed in a future kernel release. + + Unless you intend to help test and develop a feature or driver that + falls into this category, or you have a situation that requires + using these features, you should probably say N here, which will + cause the configurator to present you with fewer choices. If + you say Y here, you will be offered the choice of using features or + drivers that are currently considered to be in the alpha-test phase. 
+ +config CLEAN_COMPILE + bool "Select only drivers expected to compile cleanly" if EXPERIMENTAL + default y + help + Select this option if you don't even want to see the option + to configure known-broken drivers. + + If unsure, say Y + +config BROKEN + bool + depends on !CLEAN_COMPILE + default y + +config BROKEN_ON_SMP + bool + depends on BROKEN || !SMP + default y + +config LOCK_KERNEL + bool + depends on SMP || PREEMPT + default y + +config INIT_ENV_ARG_LIMIT + int + default 32 if !USERMODE + default 128 if USERMODE + help + This is the value of the two limits on the number of argument and of + env.var passed to init from the kernel command line. + +endmenu + +menu "General setup" + +config LOCALVERSION + string "Local version - append to kernel release" + help + Append an extra string to the end of your kernel version. + This will show up when you type uname, for example. + The string you set here will be appended after the contents of + any files with a filename matching localversion* in your + object and source tree, in that order. Your total string can + be a maximum of 64 characters. + +config SWAP + bool "Support for paging of anonymous memory (swap)" + depends on MMU + default y + help + This option allows you to choose whether you want to have support + for socalled swap devices or swap files in your kernel that are + used to provide more virtual memory than the actual RAM present + in your computer. If unsure say Y. + +config SYSVIPC + bool "System V IPC" + depends on MMU + ---help--- + Inter Process Communication is a suite of library functions and + system calls which let processes (running programs) synchronize and + exchange information. It is generally considered to be a good thing, + and some programs won't run unless you say Y here. In particular, if + you want to run the DOS emulator dosemu under Linux (read the + DOSEMU-HOWTO, available from <http://www.tldp.org/docs.html#howto>), + you'll need to say Y here. 
+ + You can find documentation about IPC with "info ipc" and also in + section 6.4 of the Linux Programmer's Guide, available from + <http://www.tldp.org/guides.html>. + +config POSIX_MQUEUE + bool "POSIX Message Queues" + depends on NET && EXPERIMENTAL + ---help--- + POSIX variant of message queues is a part of IPC. In POSIX message + queues every message has a priority which decides about succession + of receiving it by a process. If you want to compile and run + programs written e.g. for Solaris with use of its POSIX message + queues (functions mq_*) say Y here. To use this feature you will + also need mqueue library, available from + <http://www.mat.uni.torun.pl/~wrona/posix_ipc/> + + POSIX message queues are visible as a filesystem called 'mqueue' + and can be mounted somewhere if you want to do filesystem + operations on message queues. + + If unsure, say Y. + +config BSD_PROCESS_ACCT + bool "BSD Process Accounting" + help + If you say Y here, a user level program will be able to instruct the + kernel (via a special system call) to write process accounting + information to a file: whenever a process exits, information about + that process will be appended to the file by the kernel. The + information includes things such as creation time, owning user, + command name, memory usage, controlling terminal etc. (the complete + list is in the struct acct in <file:include/linux/acct.h>). It is + up to the user level program to do useful things with this + information. This is generally a good idea, so say Y. + +config BSD_PROCESS_ACCT_V3 + bool "BSD Process Accounting version 3 file format" + depends on BSD_PROCESS_ACCT + default n + help + If you say Y here, the process accounting information is written + in a new file format that also logs the process IDs of each + process and it's parent. Note that this file format is incompatible + with previous v0/v1/v2 file formats, so you will need updated tools + for processing it. 
A preliminary version of these tools is available + at <http://www.physik3.uni-rostock.de/tim/kernel/utils/acct/>. + +config SYSCTL + bool "Sysctl support" + ---help--- + The sysctl interface provides a means of dynamically changing + certain kernel parameters and variables on the fly without requiring + a recompile of the kernel or reboot of the system. The primary + interface consists of a system call, but if you say Y to "/proc + file system support", a tree of modifiable sysctl entries will be + generated beneath the /proc/sys directory. They are explained in the + files in <file:Documentation/sysctl/>. Note that enabling this + option will enlarge the kernel by at least 8 KB. + + As it is generally a good thing, you should say Y here unless + building a kernel for install/rescue disks or your system is very + limited in memory. + +config AUDIT + bool "Auditing support" + default y if SECURITY_SELINUX + help + Enable auditing infrastructure that can be used with another + kernel subsystem, such as SELinux (which requires this for + logging of avc messages output). Does not do system-call + auditing without CONFIG_AUDITSYSCALL. + +config AUDITSYSCALL + bool "Enable system-call auditing support" + depends on AUDIT && (X86 || PPC64 || ARCH_S390 || IA64) + default y if SECURITY_SELINUX + help + Enable low-overhead system-call auditing infrastructure that + can be used independently or with another kernel subsystem, + such as SELinux. + +config HOTPLUG + bool "Support for hot-pluggable devices" if !ARCH_S390 + default ARCH_S390 + help + This option is provided for the case where no in-kernel-tree + modules require HOTPLUG functionality, but a module built + outside the kernel tree does. Such modules require Y here. + +config KOBJECT_UEVENT + bool "Kernel Userspace Events" + depends on NET + default y + help + This option enables the kernel userspace event layer, which is a + simple mechanism for kernel-to-user communication over a netlink + socket. 
+ The goal of the kernel userspace events layer is to provide a simple + and efficient events system, that notifies userspace about kobject + state changes. This will enable applications to just listen for + events instead of polling system devices and files. + Hotplug events (kobject addition and removal) are also available on + the netlink socket in addition to the execution of /sbin/hotplug if + CONFIG_HOTPLUG is enabled. + + Say Y, unless you are building a system requiring minimal memory + consumption. + +config IKCONFIG + bool "Kernel .config support" + ---help--- + This option enables the complete Linux kernel ".config" file + contents to be saved in the kernel. It provides documentation + of which kernel options are used in a running kernel or in an + on-disk kernel. This information can be extracted from the kernel + image file with the script scripts/extract-ikconfig and used as + input to rebuild the current kernel or to build another kernel. + It can also be extracted from a running kernel by reading + /proc/config.gz if enabled (below). + +config IKCONFIG_PROC + bool "Enable access to .config through /proc/config.gz" + depends on IKCONFIG && PROC_FS + ---help--- + This option enables access to the kernel configuration file + through /proc/config.gz. + +config CPUSETS + bool "Cpuset support" + depends on SMP + help + This options will let you create and manage CPUSET's which + allow dynamically partitioning a system into sets of CPUs and + Memory Nodes and assigning tasks to run only within those sets. + This is primarily useful on large SMP or NUMA systems. + + Say N if unsure. + +menuconfig EMBEDDED + bool "Configure standard kernel features (for small systems)" + help + This option allows certain base kernel options and settings + to be disabled or tweaked. This is for specialized + environments which can tolerate a "non-standard" kernel. + Only use this if you really know what you are doing. 
+ +config KALLSYMS + bool "Load all symbols for debugging/kksymoops" if EMBEDDED + default y + help + Say Y here to let the kernel print out symbolic crash information and + symbolic stack backtraces. This increases the size of the kernel + somewhat, as all symbols have to be loaded into the kernel image. + +config KALLSYMS_ALL + bool "Include all symbols in kallsyms" + depends on DEBUG_KERNEL && KALLSYMS + help + Normally kallsyms only contains the symbols of functions, for nicer + OOPS messages. Some debuggers can use kallsyms for other + symbols too: say Y here to include all symbols, and you + don't care about adding 300k to the size of your kernel. + + Say N. + +config KALLSYMS_EXTRA_PASS + bool "Do an extra kallsyms pass" + depends on KALLSYMS + help + If kallsyms is not working correctly, the build will fail with + inconsistent kallsyms data. If that occurs, log a bug report and + turn on KALLSYMS_EXTRA_PASS which should result in a stable build. + Always say N here unless you find a bug in kallsyms, which must be + reported. KALLSYMS_EXTRA_PASS is only a temporary workaround while + you wait for kallsyms to be fixed. + +config BASE_FULL + default y + bool "Enable full-sized data structures for core" if EMBEDDED + help + Disabling this option reduces the size of miscellaneous core + kernel data structures. This saves memory on small machines, + but may reduce performance. + +config FUTEX + bool "Enable futex support" if EMBEDDED + default y + help + Disabling this option will cause the kernel to be built without + support for "fast userspace mutexes". The resulting kernel may not + run glibc-based applications correctly. + +config EPOLL + bool "Enable eventpoll support" if EMBEDDED + default y + help + Disabling this option will cause the kernel to be built without + support for epoll family of system calls. 
+ +config CC_OPTIMIZE_FOR_SIZE + bool "Optimize for size" if EMBEDDED + default y if ARM || H8300 + help + Enabling this option will pass "-Os" instead of "-O2" to gcc + resulting in a smaller kernel. + + WARNING: some versions of gcc may generate incorrect code with this + option. If problems are observed, a gcc upgrade may be needed. + + If unsure, say N. + +config SHMEM + bool "Use full shmem filesystem" if EMBEDDED + default y + depends on MMU + help + The shmem is an internal filesystem used to manage shared memory. + It is backed by swap and manages resource limits. It is also exported + to userspace as tmpfs if TMPFS is enabled. Disabling this + option replaces shmem and tmpfs with the much simpler ramfs code, + which may be appropriate on small systems without swap. + +config CC_ALIGN_FUNCTIONS + int "Function alignment" if EMBEDDED + default 0 + help + Align the start of functions to the next power-of-two greater than n, + skipping up to n bytes. For instance, 32 aligns functions + to the next 32-byte boundary, but 24 would align to the next + 32-byte boundary only if this can be done by skipping 23 bytes or less. + Zero means use compiler's default. + +config CC_ALIGN_LABELS + int "Label alignment" if EMBEDDED + default 0 + help + Align all branch targets to a power-of-two boundary, skipping + up to n bytes like ALIGN_FUNCTIONS. This option can easily + make code slower, because it must insert dummy operations for + when the branch target is reached in the usual flow of the code. + Zero means use compiler's default. + +config CC_ALIGN_LOOPS + int "Loop alignment" if EMBEDDED + default 0 + help + Align loops to a power-of-two boundary, skipping up to n bytes. + Zero means use compiler's default. + +config CC_ALIGN_JUMPS + int "Jump alignment" if EMBEDDED + default 0 + help + Align branch targets to a power-of-two boundary, for branch + targets where the targets can only be reached by jumping, + skipping up to n bytes like ALIGN_FUNCTIONS. 
In this case, + no dummy operations need be executed. + Zero means use compiler's default. + +endmenu # General setup + +config TINY_SHMEM + default !SHMEM + bool + +config BASE_SMALL + int + default 0 if BASE_FULL + default 1 if !BASE_FULL + +menu "Loadable module support" + +config MODULES + bool "Enable loadable module support" + help + Kernel modules are small pieces of compiled code which can + be inserted in the running kernel, rather than being + permanently built into the kernel. You use the "modprobe" + tool to add (and sometimes remove) them. If you say Y here, + many parts of the kernel can be built as modules (by + answering M instead of Y where indicated): this is most + useful for infrequently used options which are not required + for booting. For more information, see the man pages for + modprobe, lsmod, modinfo, insmod and rmmod. + + If you say Y here, you will need to run "make + modules_install" to put the modules under /lib/modules/ + where modprobe can find them (you may need to be root to do + this). + + If unsure, say Y. + +config MODULE_UNLOAD + bool "Module unloading" + depends on MODULES + help + Without this option you will not be able to unload any + modules (note that some modules may not be unloadable + anyway), which makes your kernel slightly smaller and + simpler. If unsure, say Y. + +config MODULE_FORCE_UNLOAD + bool "Forced module unloading" + depends on MODULE_UNLOAD && EXPERIMENTAL + help + This option allows you to force a module to unload, even if the + kernel believes it is unsafe: the kernel will remove the module + without waiting for anyone to stop using it (using the -f option to + rmmod). This is mainly for kernel developers and desperate users. + If unsure, say N. + +config OBSOLETE_MODPARM + bool + default y + depends on MODULES + help + You need this option to use module parameters on modules which + have not been converted to the new module parameter system yet. + If unsure, say Y. 
+ +config MODVERSIONS + bool "Module versioning support (EXPERIMENTAL)" + depends on MODULES && EXPERIMENTAL && !UML + help + Usually, you have to use modules compiled with your kernel. + Saying Y here makes it sometimes possible to use modules + compiled for different kernels, by adding enough information + to the modules to (hopefully) spot any changes which would + make them incompatible with the kernel you are running. If + unsure, say N. + +config MODULE_SRCVERSION_ALL + bool "Source checksum for all modules" + depends on MODULES + help + Modules which contain a MODULE_VERSION get an extra "srcversion" + field inserted into their modinfo section, which contains a + sum of the source files which made it. This helps maintainers + see exactly which source was used to build a module (since + others sometimes change the module source without updating + the version). With this option, such a "srcversion" field + will be created for all modules. If unsure, say N. + +config KMOD + bool "Automatic kernel module loading" + depends on MODULES + help + Normally when you have selected some parts of the kernel to + be created as kernel modules, you must load them (using the + "modprobe" command) before you can use them. If you say Y + here, some parts of the kernel will be able to load modules + automatically: when a part of the kernel needs a module, it + runs modprobe with the appropriate arguments, thereby + loading the module if it is available. If unsure, say Y. + +config STOP_MACHINE + bool + default y + depends on (SMP && MODULE_UNLOAD) || HOTPLUG_CPU + help + Need stop_machine() primitive. +endmenu diff --git a/init/Makefile b/init/Makefile new file mode 100644 index 000000000000..93a53fbdbe79 --- /dev/null +++ b/init/Makefile @@ -0,0 +1,28 @@ +# +# Makefile for the linux kernel. 
+# + +obj-y := main.o version.o mounts.o initramfs.o +obj-$(CONFIG_GENERIC_CALIBRATE_DELAY) += calibrate.o + +mounts-y := do_mounts.o +mounts-$(CONFIG_DEVFS_FS) += do_mounts_devfs.o +mounts-$(CONFIG_BLK_DEV_RAM) += do_mounts_rd.o +mounts-$(CONFIG_BLK_DEV_INITRD) += do_mounts_initrd.o +mounts-$(CONFIG_BLK_DEV_MD) += do_mounts_md.o + +# files to be removed upon make clean +clean-files := ../include/linux/compile.h + +# dependencies on generated files need to be listed explicitly + +$(obj)/version.o: include/linux/compile.h + +# compile.h changes depending on hostname, generation number, etc, +# so we regenerate it always. +# mkcompile_h will make sure to only update the +# actual file if its content has changed. + +include/linux/compile.h: FORCE + @echo ' CHK $@' + @$(CONFIG_SHELL) $(srctree)/scripts/mkcompile_h $@ "$(UTS_MACHINE)" "$(CONFIG_SMP)" "$(CC) $(CFLAGS)" diff --git a/init/calibrate.c b/init/calibrate.c new file mode 100644 index 000000000000..c698e04a3dbe --- /dev/null +++ b/init/calibrate.c @@ -0,0 +1,79 @@ +/* calibrate.c: default delay calibration + * + * Excised from init/main.c + * Copyright (C) 1991, 1992 Linus Torvalds + */ + +#include <linux/sched.h> +#include <linux/delay.h> +#include <linux/init.h> + +static unsigned long preset_lpj; +static int __init lpj_setup(char *str) +{ + preset_lpj = simple_strtoul(str,NULL,0); + return 1; +} + +__setup("lpj=", lpj_setup); + +/* + * This is the number of bits of precision for the loops_per_jiffy. Each + * bit takes on average 1.5/HZ seconds. This (like the original) is a little + * better than 1% + */ +#define LPS_PREC 8 + +void __devinit calibrate_delay(void) +{ + unsigned long ticks, loopbit; + int lps_precision = LPS_PREC; + + if (preset_lpj) { + loops_per_jiffy = preset_lpj; + printk("Calibrating delay loop (skipped)... 
" + "%lu.%02lu BogoMIPS preset\n", + loops_per_jiffy/(500000/HZ), + (loops_per_jiffy/(5000/HZ)) % 100); + } else { + loops_per_jiffy = (1<<12); + + printk(KERN_DEBUG "Calibrating delay loop... "); + while ((loops_per_jiffy <<= 1) != 0) { + /* wait for "start of" clock tick */ + ticks = jiffies; + while (ticks == jiffies) + /* nothing */; + /* Go .. */ + ticks = jiffies; + __delay(loops_per_jiffy); + ticks = jiffies - ticks; + if (ticks) + break; + } + + /* + * Do a binary approximation to get loops_per_jiffy set to + * equal one clock (up to lps_precision bits) + */ + loops_per_jiffy >>= 1; + loopbit = loops_per_jiffy; + while (lps_precision-- && (loopbit >>= 1)) { + loops_per_jiffy |= loopbit; + ticks = jiffies; + while (ticks == jiffies) + /* nothing */; + ticks = jiffies; + __delay(loops_per_jiffy); + if (jiffies != ticks) /* longer than 1 tick */ + loops_per_jiffy &= ~loopbit; + } + + /* Round the value and print it */ + printk("%lu.%02lu BogoMIPS (lpj=%lu)\n", + loops_per_jiffy/(500000/HZ), + (loops_per_jiffy/(5000/HZ)) % 100, + loops_per_jiffy); + } + +} diff --git a/init/do_mounts.c b/init/do_mounts.c new file mode 100644 index 000000000000..b7570c074d0f --- /dev/null +++ b/init/do_mounts.c @@ -0,0 +1,430 @@ +#include <linux/module.h> +#include <linux/sched.h> +#include <linux/ctype.h> +#include <linux/fd.h> +#include <linux/tty.h> +#include <linux/suspend.h> +#include <linux/root_dev.h> +#include <linux/security.h> +#include <linux/delay.h> + +#include <linux/nfs_fs.h> +#include <linux/nfs_fs_sb.h> +#include <linux/nfs_mount.h> + +#include "do_mounts.h" + +extern int get_filesystem_list(char * buf); + +int __initdata rd_doload; /* 1 = load RAM disk, 0 = don't load */ + +int root_mountflags = MS_RDONLY | MS_VERBOSE; +char * __initdata root_device_name; +static char __initdata saved_root_name[64]; + +/* this is initialized in init/main.c */ +dev_t ROOT_DEV; + +EXPORT_SYMBOL(ROOT_DEV); + +static int __init load_ramdisk(char *str) +{ + rd_doload = 
simple_strtol(str,NULL,0) & 3; + return 1; +} +__setup("load_ramdisk=", load_ramdisk); + +static int __init readonly(char *str) +{ + if (*str) + return 0; + root_mountflags |= MS_RDONLY; + return 1; +} + +static int __init readwrite(char *str) +{ + if (*str) + return 0; + root_mountflags &= ~MS_RDONLY; + return 1; +} + +__setup("ro", readonly); +__setup("rw", readwrite); + +static dev_t try_name(char *name, int part) +{ + char path[64]; + char buf[32]; + int range; + dev_t res; + char *s; + int len; + int fd; + unsigned int maj, min; + + /* read device number from .../dev */ + + sprintf(path, "/sys/block/%s/dev", name); + fd = sys_open(path, 0, 0); + if (fd < 0) + goto fail; + len = sys_read(fd, buf, 32); + sys_close(fd); + if (len <= 0 || len == 32 || buf[len - 1] != '\n') + goto fail; + buf[len - 1] = '\0'; + if (sscanf(buf, "%u:%u", &maj, &min) == 2) { + /* + * Try the %u:%u format -- see print_dev_t() + */ + res = MKDEV(maj, min); + if (maj != MAJOR(res) || min != MINOR(res)) + goto fail; + } else { + /* + * Nope. Try old-style "0321" + */ + res = new_decode_dev(simple_strtoul(buf, &s, 16)); + if (*s) + goto fail; + } + + /* if it's there and we are not looking for a partition - that's it */ + if (!part) + return res; + + /* otherwise read range from .../range */ + sprintf(path, "/sys/block/%s/range", name); + fd = sys_open(path, 0, 0); + if (fd < 0) + goto fail; + len = sys_read(fd, buf, 32); + sys_close(fd); + if (len <= 0 || len == 32 || buf[len - 1] != '\n') + goto fail; + buf[len - 1] = '\0'; + range = simple_strtoul(buf, &s, 10); + if (*s) + goto fail; + + /* if partition is within range - we got it */ + if (part < range) + return res + part; +fail: + return 0; +} + +/* + * Convert a name into device number. 
We accept the following variants: + * + * 1) device number in hexadecimal represents itself + * 2) /dev/nfs represents Root_NFS (0xff) + * 3) /dev/<disk_name> represents the device number of disk + * 4) /dev/<disk_name><decimal> represents the device number + * of partition - device number of disk plus the partition number + * 5) /dev/<disk_name>p<decimal> - same as the above, that form is + * used when disk name of partitioned disk ends on a digit. + * + * If name doesn't have fall into the categories above, we return 0. + * Driverfs is used to check if something is a disk name - it has + * all known disks under bus/block/devices. If the disk name + * contains slashes, name of driverfs node has them replaced with + * bangs. try_name() does the actual checks, assuming that driverfs + * is mounted on rootfs /sys. + */ + +dev_t name_to_dev_t(char *name) +{ + char s[32]; + char *p; + dev_t res = 0; + int part; + +#ifdef CONFIG_SYSFS + int mkdir_err = sys_mkdir("/sys", 0700); + if (sys_mount("sysfs", "/sys", "sysfs", 0, NULL) < 0) + goto out; +#endif + + if (strncmp(name, "/dev/", 5) != 0) { + unsigned maj, min; + + if (sscanf(name, "%u:%u", &maj, &min) == 2) { + res = MKDEV(maj, min); + if (maj != MAJOR(res) || min != MINOR(res)) + goto fail; + } else { + res = new_decode_dev(simple_strtoul(name, &p, 16)); + if (*p) + goto fail; + } + goto done; + } + name += 5; + res = Root_NFS; + if (strcmp(name, "nfs") == 0) + goto done; + res = Root_RAM0; + if (strcmp(name, "ram") == 0) + goto done; + + if (strlen(name) > 31) + goto fail; + strcpy(s, name); + for (p = s; *p; p++) + if (*p == '/') + *p = '!'; + res = try_name(s, 0); + if (res) + goto done; + + while (p > s && isdigit(p[-1])) + p--; + if (p == s || !*p || *p == '0') + goto fail; + part = simple_strtoul(p, NULL, 10); + *p = '\0'; + res = try_name(s, part); + if (res) + goto done; + + if (p < s + 2 || !isdigit(p[-2]) || p[-1] != 'p') + goto fail; + p[-1] = '\0'; + res = try_name(s, part); +done: +#ifdef CONFIG_SYSFS + 
sys_umount("/sys", 0); +out: + if (!mkdir_err) + sys_rmdir("/sys"); +#endif + return res; +fail: + res = 0; + goto done; +} + +static int __init root_dev_setup(char *line) +{ + strlcpy(saved_root_name, line, sizeof(saved_root_name)); + return 1; +} + +__setup("root=", root_dev_setup); + +static char * __initdata root_mount_data; +static int __init root_data_setup(char *str) +{ + root_mount_data = str; + return 1; +} + +static char * __initdata root_fs_names; +static int __init fs_names_setup(char *str) +{ + root_fs_names = str; + return 1; +} + +static unsigned int __initdata root_delay; +static int __init root_delay_setup(char *str) +{ + root_delay = simple_strtoul(str, NULL, 0); + return 1; +} + +__setup("rootflags=", root_data_setup); +__setup("rootfstype=", fs_names_setup); +__setup("rootdelay=", root_delay_setup); + +static void __init get_fs_names(char *page) +{ + char *s = page; + + if (root_fs_names) { + strcpy(page, root_fs_names); + while (*s++) { + if (s[-1] == ',') + s[-1] = '\0'; + } + } else { + int len = get_filesystem_list(page); + char *p, *next; + + page[len] = '\0'; + for (p = page-1; p; p = next) { + next = strchr(++p, '\n'); + if (*p++ != '\t') + continue; + while ((*s++ = *p++) != '\n') + ; + s[-1] = '\0'; + } + } + *s = '\0'; +} + +static int __init do_mount_root(char *name, char *fs, int flags, void *data) +{ + int err = sys_mount(name, "/root", fs, flags, data); + if (err) + return err; + + sys_chdir("/root"); + ROOT_DEV = current->fs->pwdmnt->mnt_sb->s_dev; + printk("VFS: Mounted root (%s filesystem)%s.\n", + current->fs->pwdmnt->mnt_sb->s_type->name, + current->fs->pwdmnt->mnt_sb->s_flags & MS_RDONLY ? 
+ " readonly" : ""); + return 0; +} + +void __init mount_block_root(char *name, int flags) +{ + char *fs_names = __getname(); + char *p; + char b[BDEVNAME_SIZE]; + + get_fs_names(fs_names); +retry: + for (p = fs_names; *p; p += strlen(p)+1) { + int err = do_mount_root(name, p, flags, root_mount_data); + switch (err) { + case 0: + goto out; + case -EACCES: + flags |= MS_RDONLY; + goto retry; + case -EINVAL: + continue; + } + /* + * Allow the user to distinguish between failed sys_open + * and bad superblock on root device. + */ + __bdevname(ROOT_DEV, b); + printk("VFS: Cannot open root device \"%s\" or %s\n", + root_device_name, b); + printk("Please append a correct \"root=\" boot option\n"); + + panic("VFS: Unable to mount root fs on %s", b); + } + panic("VFS: Unable to mount root fs on %s", __bdevname(ROOT_DEV, b)); +out: + putname(fs_names); +} + +#ifdef CONFIG_ROOT_NFS +static int __init mount_nfs_root(void) +{ + void *data = nfs_root_data(); + + create_dev("/dev/root", ROOT_DEV, NULL); + if (data && + do_mount_root("/dev/root", "nfs", root_mountflags, data) == 0) + return 1; + return 0; +} +#endif + +#if defined(CONFIG_BLK_DEV_RAM) || defined(CONFIG_BLK_DEV_FD) +void __init change_floppy(char *fmt, ...) 
+{ + struct termios termios; + char buf[80]; + char c; + int fd; + va_list args; + va_start(args, fmt); + vsprintf(buf, fmt, args); + va_end(args); + fd = sys_open("/dev/root", O_RDWR | O_NDELAY, 0); + if (fd >= 0) { + sys_ioctl(fd, FDEJECT, 0); + sys_close(fd); + } + printk(KERN_NOTICE "VFS: Insert %s and press ENTER\n", buf); + fd = sys_open("/dev/console", O_RDWR, 0); + if (fd >= 0) { + sys_ioctl(fd, TCGETS, (long)&termios); + termios.c_lflag &= ~ICANON; + sys_ioctl(fd, TCSETSF, (long)&termios); + sys_read(fd, &c, 1); + termios.c_lflag |= ICANON; + sys_ioctl(fd, TCSETSF, (long)&termios); + sys_close(fd); + } +} +#endif + +void __init mount_root(void) +{ +#ifdef CONFIG_ROOT_NFS + if (MAJOR(ROOT_DEV) == UNNAMED_MAJOR) { + if (mount_nfs_root()) + return; + + printk(KERN_ERR "VFS: Unable to mount root fs via NFS, trying floppy.\n"); + ROOT_DEV = Root_FD0; + } +#endif +#ifdef CONFIG_BLK_DEV_FD + if (MAJOR(ROOT_DEV) == FLOPPY_MAJOR) { + /* rd_doload is 2 for a dual initrd/ramload setup */ + if (rd_doload==2) { + if (rd_load_disk(1)) { + ROOT_DEV = Root_RAM1; + root_device_name = NULL; + } + } else + change_floppy("root floppy"); + } +#endif + create_dev("/dev/root", ROOT_DEV, root_device_name); + mount_block_root("/dev/root", root_mountflags); +} + +/* + * Prepare the namespace - decide what/where to mount, load ramdisks, etc. 
+ */ +void __init prepare_namespace(void) +{ + int is_floppy; + + mount_devfs(); + + if (root_delay) { + printk(KERN_INFO "Waiting %dsec before mounting root device...\n", + root_delay); + ssleep(root_delay); + } + + md_run_setup(); + + if (saved_root_name[0]) { + root_device_name = saved_root_name; + ROOT_DEV = name_to_dev_t(root_device_name); + if (strncmp(root_device_name, "/dev/", 5) == 0) + root_device_name += 5; + } + + is_floppy = MAJOR(ROOT_DEV) == FLOPPY_MAJOR; + + if (initrd_load()) + goto out; + + if (is_floppy && rd_doload && rd_load_disk(0)) + ROOT_DEV = Root_RAM0; + + mount_root(); +out: + umount_devfs("/dev"); + sys_mount(".", "/", NULL, MS_MOVE, NULL); + sys_chroot("."); + security_sb_post_mountroot(); + mount_devfs_fs (); +} + diff --git a/init/do_mounts.h b/init/do_mounts.h new file mode 100644 index 000000000000..de92bee4f35e --- /dev/null +++ b/init/do_mounts.h @@ -0,0 +1,92 @@ +#include <linux/config.h> +#include <linux/kernel.h> +#include <linux/devfs_fs_kernel.h> +#include <linux/init.h> +#include <linux/syscalls.h> +#include <linux/unistd.h> +#include <linux/slab.h> +#include <linux/mount.h> +#include <linux/major.h> +#include <linux/root_dev.h> + +dev_t name_to_dev_t(char *name); +void change_floppy(char *fmt, ...); +void mount_block_root(char *name, int flags); +void mount_root(void); +extern int root_mountflags; +extern char *root_device_name; + +#ifdef CONFIG_DEVFS_FS + +void mount_devfs(void); +void umount_devfs(char *path); +int create_dev(char *name, dev_t dev, char *devfs_name); + +#else + +static inline void mount_devfs(void) {} +static inline void umount_devfs(const char *path) {} + +static inline int create_dev(char *name, dev_t dev, char *devfs_name) +{ + sys_unlink(name); + return sys_mknod(name, S_IFBLK|0600, new_encode_dev(dev)); +} + +#endif + +#if BITS_PER_LONG == 32 +static inline u32 bstat(char *name) +{ + struct stat64 stat; + if (sys_stat64(name, &stat) != 0) + return 0; + if (!S_ISBLK(stat.st_mode)) + return 0; + if 
(stat.st_rdev != (u32)stat.st_rdev) + return 0; + return stat.st_rdev; +} +#else +static inline u32 bstat(char *name) +{ + struct stat stat; + if (sys_newstat(name, &stat) != 0) + return 0; + if (!S_ISBLK(stat.st_mode)) + return 0; + return stat.st_rdev; +} +#endif + +#ifdef CONFIG_BLK_DEV_RAM + +int __init rd_load_disk(int n); +int __init rd_load_image(char *from); + +#else + +static inline int rd_load_disk(int n) { return 0; } +static inline int rd_load_image(char *from) { return 0; } + +#endif + +#ifdef CONFIG_BLK_DEV_INITRD + +int __init initrd_load(void); + +#else + +static inline int initrd_load(void) { return 0; } + +#endif + +#ifdef CONFIG_BLK_DEV_MD + +void md_run_setup(void); + +#else + +static inline void md_run_setup(void) {} + +#endif diff --git a/init/do_mounts_devfs.c b/init/do_mounts_devfs.c new file mode 100644 index 000000000000..cc526474690a --- /dev/null +++ b/init/do_mounts_devfs.c @@ -0,0 +1,137 @@ + +#include <linux/kernel.h> +#include <linux/dirent.h> +#include <linux/string.h> + +#include "do_mounts.h" + +void __init mount_devfs(void) +{ + sys_mount("devfs", "/dev", "devfs", 0, NULL); +} + +void __init umount_devfs(char *path) +{ + sys_umount(path, 0); +} + +/* + * If the dir will fit in *buf, return its length. If it won't fit, return + * zero. Return -ve on error. + */ +static int __init do_read_dir(int fd, void *buf, int len) +{ + long bytes, n; + char *p = buf; + sys_lseek(fd, 0, 0); + + for (bytes = 0; bytes < len; bytes += n) { + n = sys_getdents64(fd, (struct linux_dirent64 *)(p + bytes), + len - bytes); + if (n < 0) + return n; + if (n == 0) + return bytes; + } + return 0; +} + +/* + * Try to read all of a directory. Returns the contents at *p, which + * is kmalloced memory. Returns the number of bytes read at *len. Returns + * NULL on error. 
+ */ +static void * __init read_dir(char *path, int *len) +{ + int size; + int fd = sys_open(path, 0, 0); + + *len = 0; + if (fd < 0) + return NULL; + + for (size = 1 << 9; size <= (PAGE_SIZE << MAX_ORDER); size <<= 1) { + void *p = kmalloc(size, GFP_KERNEL); + int n; + if (!p) + break; + n = do_read_dir(fd, p, size); + if (n > 0) { + sys_close(fd); + *len = n; + return p; + } + kfree(p); + if (n == -EINVAL) + continue; /* Try a larger buffer */ + if (n < 0) + break; + } + sys_close(fd); + return NULL; +} + +/* + * recursively scan <path>, looking for a device node of type <dev> + */ +static int __init find_in_devfs(char *path, unsigned dev) +{ + char *end = path + strlen(path); + int rest = path + 64 - end; + int size; + char *p = read_dir(path, &size); + char *s; + + if (!p) + return -1; + for (s = p; s < p + size; s += ((struct linux_dirent64 *)s)->d_reclen) { + struct linux_dirent64 *d = (struct linux_dirent64 *)s; + if (strlen(d->d_name) + 2 > rest) + continue; + switch (d->d_type) { + case DT_BLK: + sprintf(end, "/%s", d->d_name); + if (bstat(path) != dev) + break; + kfree(p); + return 0; + case DT_DIR: + if (strcmp(d->d_name, ".") == 0) + break; + if (strcmp(d->d_name, "..") == 0) + break; + sprintf(end, "/%s", d->d_name); + if (find_in_devfs(path, dev) < 0) + break; + kfree(p); + return 0; + } + } + kfree(p); + return -1; +} + +/* + * create a device node called <name> which points to + * <devfs_name> if possible, otherwise find a device node + * which matches <dev> and make <name> a symlink pointing to it. 
+ */ +int __init create_dev(char *name, dev_t dev, char *devfs_name) +{ + char path[64]; + + sys_unlink(name); + if (devfs_name && devfs_name[0]) { + if (strncmp(devfs_name, "/dev/", 5) == 0) + devfs_name += 5; + sprintf(path, "/dev/%s", devfs_name); + if (sys_access(path, 0) == 0) + return sys_symlink(devfs_name, name); + } + if (!dev) + return -1; + strcpy(path, "/dev"); + if (find_in_devfs(path, new_encode_dev(dev)) < 0) + return -1; + return sys_symlink(path + 5, name); +} diff --git a/init/do_mounts_initrd.c b/init/do_mounts_initrd.c new file mode 100644 index 000000000000..07e7d31f2d0b --- /dev/null +++ b/init/do_mounts_initrd.c @@ -0,0 +1,121 @@ +#define __KERNEL_SYSCALLS__ +#include <linux/unistd.h> +#include <linux/kernel.h> +#include <linux/fs.h> +#include <linux/minix_fs.h> +#include <linux/ext2_fs.h> +#include <linux/romfs_fs.h> +#include <linux/initrd.h> +#include <linux/sched.h> + +#include "do_mounts.h" + +unsigned long initrd_start, initrd_end; +int initrd_below_start_ok; +unsigned int real_root_dev; /* do_proc_dointvec cannot handle kdev_t */ +static int __initdata old_fd, root_fd; +static int __initdata mount_initrd = 1; + +static int __init no_initrd(char *str) +{ + mount_initrd = 0; + return 1; +} + +__setup("noinitrd", no_initrd); + +static int __init do_linuxrc(void * shell) +{ + static char *argv[] = { "linuxrc", NULL, }; + extern char * envp_init[]; + + sys_close(old_fd);sys_close(root_fd); + sys_close(0);sys_close(1);sys_close(2); + sys_setsid(); + (void) sys_open("/dev/console",O_RDWR,0); + (void) sys_dup(0); + (void) sys_dup(0); + return execve(shell, argv, envp_init); +} + +static void __init handle_initrd(void) +{ + int error; + int i, pid; + + real_root_dev = new_encode_dev(ROOT_DEV); + create_dev("/dev/root.old", Root_RAM0, NULL); + /* mount initrd on rootfs' /root */ + mount_block_root("/dev/root.old", root_mountflags & ~MS_RDONLY); + sys_mkdir("/old", 0700); + root_fd = sys_open("/", 0, 0); + old_fd = sys_open("/old", 0, 0); + /* 
move initrd over / and chdir/chroot in initrd root */ + sys_chdir("/root"); + sys_mount(".", "/", NULL, MS_MOVE, NULL); + sys_chroot("."); + mount_devfs_fs (); + + pid = kernel_thread(do_linuxrc, "/linuxrc", SIGCHLD); + if (pid > 0) { + while (pid != sys_wait4(-1, &i, 0, NULL)) + yield(); + } + + /* move initrd to rootfs' /old */ + sys_fchdir(old_fd); + sys_mount("/", ".", NULL, MS_MOVE, NULL); + /* switch root and cwd back to / of rootfs */ + sys_fchdir(root_fd); + sys_chroot("."); + sys_close(old_fd); + sys_close(root_fd); + umount_devfs("/old/dev"); + + if (new_decode_dev(real_root_dev) == Root_RAM0) { + sys_chdir("/old"); + return; + } + + ROOT_DEV = new_decode_dev(real_root_dev); + mount_root(); + + printk(KERN_NOTICE "Trying to move old root to /initrd ... "); + error = sys_mount("/old", "/root/initrd", NULL, MS_MOVE, NULL); + if (!error) + printk("okay\n"); + else { + int fd = sys_open("/dev/root.old", O_RDWR, 0); + printk("failed\n"); + printk(KERN_NOTICE "Unmounting old root\n"); + sys_umount("/old", MNT_DETACH); + printk(KERN_NOTICE "Trying to free ramdisk memory ... "); + if (fd < 0) { + error = fd; + } else { + error = sys_ioctl(fd, BLKFLSBUF, 0); + sys_close(fd); + } + printk(!error ? "okay\n" : "failed\n"); + } +} + +int __init initrd_load(void) +{ + if (mount_initrd) { + create_dev("/dev/ram", Root_RAM0, NULL); + /* + * Load the initrd data into /dev/ram0. Execute it as initrd + * unless /dev/ram0 is supposed to be our actual root device, + * in that case the ram disk is just set up here, and gets + * mounted in the normal path. 
+ */ + if (rd_load_image("/initrd.image") && ROOT_DEV != Root_RAM0) { + sys_unlink("/initrd.image"); + handle_initrd(); + return 1; + } + } + sys_unlink("/initrd.image"); + return 0; +} diff --git a/init/do_mounts_md.c b/init/do_mounts_md.c new file mode 100644 index 000000000000..3fbc3555ce96 --- /dev/null +++ b/init/do_mounts_md.c @@ -0,0 +1,290 @@ + +#include <linux/raid/md.h> + +#include "do_mounts.h" + +/* + * When md (and any require personalities) are compiled into the kernel + * (not a module), arrays can be assembles are boot time using with AUTODETECT + * where specially marked partitions are registered with md_autodetect_dev(), + * and with MD_BOOT where devices to be collected are given on the boot line + * with md=..... + * The code for that is here. + */ + +static int __initdata raid_noautodetect, raid_autopart; + +static struct { + int minor; + int partitioned; + int pers; + int chunk; + char *device_names; +} md_setup_args[MAX_MD_DEVS] __initdata; + +static int md_setup_ents __initdata; + +extern int mdp_major; +/* + * Parse the command-line parameters given our kernel, but do not + * actually try to invoke the MD device now; that is handled by + * md_setup_drive after the low-level disk drivers have initialised. + * + * 27/11/1999: Fixed to work correctly with the 2.3 kernel (which + * assigns the task of parsing integer arguments to the + * invoked program now). Added ability to initialise all + * the MD devices (by specifying multiple "md=" lines) + * instead of just one. 
-- KTK + * 18May2000: Added support for persistent-superblock arrays: + * md=n,0,factor,fault,device-list uses RAID0 for device n + * md=n,-1,factor,fault,device-list uses LINEAR for device n + * md=n,device-list reads a RAID superblock from the devices + * elements in device-list are read by name_to_kdev_t so can be + * a hex number or something like /dev/hda1 /dev/sdb + * 2001-06-03: Dave Cinege <dcinege@psychosis.com> + * Shifted name_to_kdev_t() and related operations to md_set_drive() + * for later execution. Rewrote section to make devfs compatible. + */ +static int __init md_setup(char *str) +{ + int minor, level, factor, fault, pers, partitioned = 0; + char *pername = ""; + char *str1; + int ent; + + if (*str == 'd') { + partitioned = 1; + str++; + } + if (get_option(&str, &minor) != 2) { /* MD Number */ + printk(KERN_WARNING "md: Too few arguments supplied to md=.\n"); + return 0; + } + str1 = str; + if (minor >= MAX_MD_DEVS) { + printk(KERN_WARNING "md: md=%d, Minor device number too high.\n", minor); + return 0; + } + for (ent=0 ; ent< md_setup_ents ; ent++) + if (md_setup_args[ent].minor == minor && + md_setup_args[ent].partitioned == partitioned) { + printk(KERN_WARNING "md: md=%s%d, Specified more than once. " + "Replacing previous definition.\n", partitioned?"d":"", minor); + break; + } + if (ent >= MAX_MD_DEVS) { + printk(KERN_WARNING "md: md=%s%d - too many md initialisations\n", partitioned?"d":"", minor); + return 0; + } + if (ent >= md_setup_ents) + md_setup_ents++; + switch (get_option(&str, &level)) { /* RAID Personality */ + case 2: /* could be 0 or -1.. 
*/ + if (level == 0 || level == LEVEL_LINEAR) { + if (get_option(&str, &factor) != 2 || /* Chunk Size */ + get_option(&str, &fault) != 2) { + printk(KERN_WARNING "md: Too few arguments supplied to md=.\n"); + return 0; + } + md_setup_args[ent].pers = level; + md_setup_args[ent].chunk = 1 << (factor+12); + if (level == LEVEL_LINEAR) { + pers = LINEAR; + pername = "linear"; + } else { + pers = RAID0; + pername = "raid0"; + } + md_setup_args[ent].pers = pers; + break; + } + /* FALL THROUGH */ + case 1: /* the first device is numeric */ + str = str1; + /* FALL THROUGH */ + case 0: + md_setup_args[ent].pers = 0; + pername="super-block"; + } + + printk(KERN_INFO "md: Will configure md%d (%s) from %s, below.\n", + minor, pername, str); + md_setup_args[ent].device_names = str; + md_setup_args[ent].partitioned = partitioned; + md_setup_args[ent].minor = minor; + + return 1; +} + +#define MdpMinorShift 6 + +static void __init md_setup_drive(void) +{ + int minor, i, ent, partitioned; + dev_t dev; + dev_t devices[MD_SB_DISKS+1]; + + for (ent = 0; ent < md_setup_ents ; ent++) { + int fd; + int err = 0; + char *devname; + mdu_disk_info_t dinfo; + char name[16], devfs_name[16]; + + minor = md_setup_args[ent].minor; + partitioned = md_setup_args[ent].partitioned; + devname = md_setup_args[ent].device_names; + + sprintf(name, "/dev/md%s%d", partitioned?"_d":"", minor); + sprintf(devfs_name, "/dev/md/%s%d", partitioned?"d":"", minor); + if (partitioned) + dev = MKDEV(mdp_major, minor << MdpMinorShift); + else + dev = MKDEV(MD_MAJOR, minor); + create_dev(name, dev, devfs_name); + for (i = 0; i < MD_SB_DISKS && devname != 0; i++) { + char *p; + char comp_name[64]; + u32 rdev; + + p = strchr(devname, ','); + if (p) + *p++ = 0; + + dev = name_to_dev_t(devname); + if (strncmp(devname, "/dev/", 5) == 0) + devname += 5; + snprintf(comp_name, 63, "/dev/%s", devname); + rdev = bstat(comp_name); + if (rdev) + dev = new_decode_dev(rdev); + if (!dev) { + printk(KERN_WARNING "md: Unknown device 
name: %s\n", devname); + break; + } + + devices[i] = dev; + + devname = p; + } + devices[i] = 0; + + if (!i) + continue; + + printk(KERN_INFO "md: Loading md%s%d: %s\n", + partitioned ? "_d" : "", minor, + md_setup_args[ent].device_names); + + fd = sys_open(name, 0, 0); + if (fd < 0) { + printk(KERN_ERR "md: open failed - cannot start " + "array %s\n", name); + continue; + } + if (sys_ioctl(fd, SET_ARRAY_INFO, 0) == -EBUSY) { + printk(KERN_WARNING + "md: Ignoring md=%d, already autodetected. (Use raid=noautodetect)\n", + minor); + sys_close(fd); + continue; + } + + if (md_setup_args[ent].pers) { + /* non-persistent */ + mdu_array_info_t ainfo; + ainfo.level = pers_to_level(md_setup_args[ent].pers); + ainfo.size = 0; + ainfo.nr_disks =0; + ainfo.raid_disks =0; + while (devices[ainfo.raid_disks]) + ainfo.raid_disks++; + ainfo.md_minor =minor; + ainfo.not_persistent = 1; + + ainfo.state = (1 << MD_SB_CLEAN); + ainfo.layout = 0; + ainfo.chunk_size = md_setup_args[ent].chunk; + err = sys_ioctl(fd, SET_ARRAY_INFO, (long)&ainfo); + for (i = 0; !err && i <= MD_SB_DISKS; i++) { + dev = devices[i]; + if (!dev) + break; + dinfo.number = i; + dinfo.raid_disk = i; + dinfo.state = (1<<MD_DISK_ACTIVE)|(1<<MD_DISK_SYNC); + dinfo.major = MAJOR(dev); + dinfo.minor = MINOR(dev); + err = sys_ioctl(fd, ADD_NEW_DISK, (long)&dinfo); + } + } else { + /* persistent */ + for (i = 0; i <= MD_SB_DISKS; i++) { + dev = devices[i]; + if (!dev) + break; + dinfo.major = MAJOR(dev); + dinfo.minor = MINOR(dev); + sys_ioctl(fd, ADD_NEW_DISK, (long)&dinfo); + } + } + if (!err) + err = sys_ioctl(fd, RUN_ARRAY, 0); + if (err) + printk(KERN_WARNING "md: starting md%d failed\n", minor); + else { + /* reread the partition table. 
+ * I (neilb) and not sure why this is needed, but I cannot + * boot a kernel with devfs compiled in from partitioned md + * array without it + */ + sys_close(fd); + fd = sys_open(name, 0, 0); + sys_ioctl(fd, BLKRRPART, 0); + } + sys_close(fd); + } +} + +static int __init raid_setup(char *str) +{ + int len, pos; + + len = strlen(str) + 1; + pos = 0; + + while (pos < len) { + char *comma = strchr(str+pos, ','); + int wlen; + if (comma) + wlen = (comma-str)-pos; + else wlen = (len-1)-pos; + + if (!strncmp(str, "noautodetect", wlen)) + raid_noautodetect = 1; + if (strncmp(str, "partitionable", wlen)==0) + raid_autopart = 1; + if (strncmp(str, "part", wlen)==0) + raid_autopart = 1; + pos += wlen+1; + } + return 1; +} + +__setup("raid=", raid_setup); +__setup("md=", md_setup); + +void __init md_run_setup(void) +{ + create_dev("/dev/md0", MKDEV(MD_MAJOR, 0), "md/0"); + if (raid_noautodetect) + printk(KERN_INFO "md: Skipping autodetection of RAID arrays. (raid=noautodetect)\n"); + else { + int fd = sys_open("/dev/md0", 0, 0); + if (fd >= 0) { + sys_ioctl(fd, RAID_AUTORUN, raid_autopart); + sys_close(fd); + } + } + md_setup_drive(); +} diff --git a/init/do_mounts_rd.c b/init/do_mounts_rd.c new file mode 100644 index 000000000000..c10b08a80982 --- /dev/null +++ b/init/do_mounts_rd.c @@ -0,0 +1,429 @@ + +#include <linux/kernel.h> +#include <linux/fs.h> +#include <linux/minix_fs.h> +#include <linux/ext2_fs.h> +#include <linux/romfs_fs.h> +#include <linux/cramfs_fs.h> +#include <linux/initrd.h> +#include <linux/string.h> + +#include "do_mounts.h" + +#define BUILD_CRAMDISK + +int __initdata rd_prompt = 1;/* 1 = prompt for RAM disk, 0 = don't prompt */ + +static int __init prompt_ramdisk(char *str) +{ + rd_prompt = simple_strtol(str,NULL,0) & 1; + return 1; +} +__setup("prompt_ramdisk=", prompt_ramdisk); + +int __initdata rd_image_start; /* starting block # of image */ + +static int __init ramdisk_start_setup(char *str) +{ + rd_image_start = simple_strtol(str,NULL,0); + return 
1; +} +__setup("ramdisk_start=", ramdisk_start_setup); + +static int __init crd_load(int in_fd, int out_fd); + +/* + * This routine tries to find a RAM disk image to load, and returns the + * number of blocks to read for a non-compressed image, 0 if the image + * is a compressed image, and -1 if an image with the right magic + * numbers could not be found. + * + * We currently check for the following magic numbers: + * minix + * ext2 + * romfs + * cramfs + * gzip + */ +static int __init +identify_ramdisk_image(int fd, int start_block) +{ + const int size = 512; + struct minix_super_block *minixsb; + struct ext2_super_block *ext2sb; + struct romfs_super_block *romfsb; + struct cramfs_super *cramfsb; + int nblocks = -1; + unsigned char *buf; + + buf = kmalloc(size, GFP_KERNEL); + if (buf == 0) + return -1; + + minixsb = (struct minix_super_block *) buf; + ext2sb = (struct ext2_super_block *) buf; + romfsb = (struct romfs_super_block *) buf; + cramfsb = (struct cramfs_super *) buf; + memset(buf, 0xe5, size); + + /* + * Read block 0 to test for gzipped kernel + */ + sys_lseek(fd, start_block * BLOCK_SIZE, 0); + sys_read(fd, buf, size); + + /* + * If it matches the gzip magic numbers, return -1 + */ + if (buf[0] == 037 && ((buf[1] == 0213) || (buf[1] == 0236))) { + printk(KERN_NOTICE + "RAMDISK: Compressed image found at block %d\n", + start_block); + nblocks = 0; + goto done; + } + + /* romfs is at block zero too */ + if (romfsb->word0 == ROMSB_WORD0 && + romfsb->word1 == ROMSB_WORD1) { + printk(KERN_NOTICE + "RAMDISK: romfs filesystem found at block %d\n", + start_block); + nblocks = (ntohl(romfsb->size)+BLOCK_SIZE-1)>>BLOCK_SIZE_BITS; + goto done; + } + + if (cramfsb->magic == CRAMFS_MAGIC) { + printk(KERN_NOTICE + "RAMDISK: cramfs filesystem found at block %d\n", + start_block); + nblocks = (cramfsb->size + BLOCK_SIZE - 1) >> BLOCK_SIZE_BITS; + goto done; + } + + /* + * Read block 1 to test for minix and ext2 superblock + */ + sys_lseek(fd, (start_block+1) * 
BLOCK_SIZE, 0); + sys_read(fd, buf, size); + + /* Try minix */ + if (minixsb->s_magic == MINIX_SUPER_MAGIC || + minixsb->s_magic == MINIX_SUPER_MAGIC2) { + printk(KERN_NOTICE + "RAMDISK: Minix filesystem found at block %d\n", + start_block); + nblocks = minixsb->s_nzones << minixsb->s_log_zone_size; + goto done; + } + + /* Try ext2 */ + if (ext2sb->s_magic == cpu_to_le16(EXT2_SUPER_MAGIC)) { + printk(KERN_NOTICE + "RAMDISK: ext2 filesystem found at block %d\n", + start_block); + nblocks = le32_to_cpu(ext2sb->s_blocks_count) << + le32_to_cpu(ext2sb->s_log_block_size); + goto done; + } + + printk(KERN_NOTICE + "RAMDISK: Couldn't find valid RAM disk image starting at %d.\n", + start_block); + +done: + sys_lseek(fd, start_block * BLOCK_SIZE, 0); + kfree(buf); + return nblocks; +} + +int __init rd_load_image(char *from) +{ + int res = 0; + int in_fd, out_fd; + unsigned long rd_blocks, devblocks; + int nblocks, i, disk; + char *buf = NULL; + unsigned short rotate = 0; +#if !defined(CONFIG_ARCH_S390) && !defined(CONFIG_PPC_ISERIES) + char rotator[4] = { '|' , '/' , '-' , '\\' }; +#endif + + out_fd = sys_open("/dev/ram", O_RDWR, 0); + if (out_fd < 0) + goto out; + + in_fd = sys_open(from, O_RDONLY, 0); + if (in_fd < 0) + goto noclose_input; + + nblocks = identify_ramdisk_image(in_fd, rd_image_start); + if (nblocks < 0) + goto done; + + if (nblocks == 0) { +#ifdef BUILD_CRAMDISK + if (crd_load(in_fd, out_fd) == 0) + goto successful_load; +#else + printk(KERN_NOTICE + "RAMDISK: Kernel does not support compressed " + "RAM disk images\n"); +#endif + goto done; + } + + /* + * NOTE NOTE: nblocks is not actually blocks but + * the number of kibibytes of data to load into a ramdisk. + * So any ramdisk block size that is a multiple of 1KiB should + * work when the appropriate ramdisk_blocksize is specified + * on the command line. 
+ * + * The default ramdisk_blocksize is 1KiB and it is generally + * silly to use anything else, so make sure to use 1KiB + * blocksize while generating ext2fs ramdisk-images. + */ + if (sys_ioctl(out_fd, BLKGETSIZE, (unsigned long)&rd_blocks) < 0) + rd_blocks = 0; + else + rd_blocks >>= 1; + + if (nblocks > rd_blocks) { + printk("RAMDISK: image too big! (%dKiB/%ldKiB)\n", + nblocks, rd_blocks); + goto done; + } + + /* + * OK, time to copy in the data + */ + if (sys_ioctl(in_fd, BLKGETSIZE, (unsigned long)&devblocks) < 0) + devblocks = 0; + else + devblocks >>= 1; + + if (strcmp(from, "/initrd.image") == 0) + devblocks = nblocks; + + if (devblocks == 0) { + printk(KERN_ERR "RAMDISK: could not determine device size\n"); + goto done; + } + + buf = kmalloc(BLOCK_SIZE, GFP_KERNEL); + if (buf == 0) { + printk(KERN_ERR "RAMDISK: could not allocate buffer\n"); + goto done; + } + + printk(KERN_NOTICE "RAMDISK: Loading %dKiB [%ld disk%s] into ram disk... ", + nblocks, ((nblocks-1)/devblocks)+1, nblocks>devblocks ? "s" : ""); + for (i = 0, disk = 1; i < nblocks; i++) { + if (i && (i % devblocks == 0)) { + printk("done disk #%d.\n", disk++); + rotate = 0; + if (sys_close(in_fd)) { + printk("Error closing the disk.\n"); + goto noclose_input; + } + change_floppy("disk #%d", disk); + in_fd = sys_open(from, O_RDONLY, 0); + if (in_fd < 0) { + printk("Error opening disk.\n"); + goto noclose_input; + } + printk("Loading disk #%d... 
", disk); + } + sys_read(in_fd, buf, BLOCK_SIZE); + sys_write(out_fd, buf, BLOCK_SIZE); +#if !defined(CONFIG_ARCH_S390) && !defined(CONFIG_PPC_ISERIES) + if (!(i % 16)) { + printk("%c\b", rotator[rotate & 0x3]); + rotate++; + } +#endif + } + printk("done.\n"); + +successful_load: + res = 1; +done: + sys_close(in_fd); +noclose_input: + sys_close(out_fd); +out: + kfree(buf); + sys_unlink("/dev/ram"); + return res; +} + +int __init rd_load_disk(int n) +{ + if (rd_prompt) + change_floppy("root floppy disk to be loaded into RAM disk"); + create_dev("/dev/root", ROOT_DEV, root_device_name); + create_dev("/dev/ram", MKDEV(RAMDISK_MAJOR, n), NULL); + return rd_load_image("/dev/root"); +} + +#ifdef BUILD_CRAMDISK + +/* + * gzip declarations + */ + +#define OF(args) args + +#ifndef memzero +#define memzero(s, n) memset ((s), 0, (n)) +#endif + +typedef unsigned char uch; +typedef unsigned short ush; +typedef unsigned long ulg; + +#define INBUFSIZ 4096 +#define WSIZE 0x8000 /* window size--must be a power of two, and */ + /* at least 32K for zip's deflate method */ + +static uch *inbuf; +static uch *window; + +static unsigned insize; /* valid bytes in inbuf */ +static unsigned inptr; /* index of next byte to be processed in inbuf */ +static unsigned outcnt; /* bytes in output buffer */ +static int exit_code; +static int unzip_error; +static long bytes_out; +static int crd_infd, crd_outfd; + +#define get_byte() (inptr < insize ? 
inbuf[inptr++] : fill_inbuf()) + +/* Diagnostic functions (stubbed out) */ +#define Assert(cond,msg) +#define Trace(x) +#define Tracev(x) +#define Tracevv(x) +#define Tracec(c,x) +#define Tracecv(c,x) + +#define STATIC static +#define INIT __init + +static int __init fill_inbuf(void); +static void __init flush_window(void); +static void __init *malloc(size_t size); +static void __init free(void *where); +static void __init error(char *m); +static void __init gzip_mark(void **); +static void __init gzip_release(void **); + +#include "../lib/inflate.c" + +static void __init *malloc(size_t size) +{ + return kmalloc(size, GFP_KERNEL); +} + +static void __init free(void *where) +{ + kfree(where); +} + +static void __init gzip_mark(void **ptr) +{ +} + +static void __init gzip_release(void **ptr) +{ +} + + +/* =========================================================================== + * Fill the input buffer. This is called only when the buffer is empty + * and at least one byte is really needed. + * Returning -1 does not guarantee that gunzip() will ever return. + */ +static int __init fill_inbuf(void) +{ + if (exit_code) return -1; + + insize = sys_read(crd_infd, inbuf, INBUFSIZ); + if (insize == 0) { + error("RAMDISK: ran out of compressed data"); + return -1; + } + + inptr = 1; + + return inbuf[0]; +} + +/* =========================================================================== + * Write the output window window[0..outcnt-1] and update crc and bytes_out. + * (Used for the decompressed data only.) 
+ */ +static void __init flush_window(void) +{ + ulg c = crc; /* temporary variable */ + unsigned n, written; + uch *in, ch; + + written = sys_write(crd_outfd, window, outcnt); + if (written != outcnt && unzip_error == 0) { + printk(KERN_ERR "RAMDISK: incomplete write (%d != %d) %ld\n", + written, outcnt, bytes_out); + unzip_error = 1; + } + in = window; + for (n = 0; n < outcnt; n++) { + ch = *in++; + c = crc_32_tab[((int)c ^ ch) & 0xff] ^ (c >> 8); + } + crc = c; + bytes_out += (ulg)outcnt; + outcnt = 0; +} + +static void __init error(char *x) +{ + printk(KERN_ERR "%s\n", x); + exit_code = 1; + unzip_error = 1; +} + +static int __init crd_load(int in_fd, int out_fd) +{ + int result; + + insize = 0; /* valid bytes in inbuf */ + inptr = 0; /* index of next byte to be processed in inbuf */ + outcnt = 0; /* bytes in output buffer */ + exit_code = 0; + bytes_out = 0; + crc = (ulg)0xffffffffL; /* shift register contents */ + + crd_infd = in_fd; + crd_outfd = out_fd; + inbuf = kmalloc(INBUFSIZ, GFP_KERNEL); + if (inbuf == 0) { + printk(KERN_ERR "RAMDISK: Couldn't allocate gzip buffer\n"); + return -1; + } + window = kmalloc(WSIZE, GFP_KERNEL); + if (window == 0) { + printk(KERN_ERR "RAMDISK: Couldn't allocate gzip window\n"); + kfree(inbuf); + return -1; + } + makecrc(); + result = gunzip(); + if (unzip_error) + result = 1; + kfree(inbuf); + kfree(window); + return result; +} + +#endif /* BUILD_CRAMDISK */ diff --git a/init/initramfs.c b/init/initramfs.c new file mode 100644 index 000000000000..02c5ce64990d --- /dev/null +++ b/init/initramfs.c @@ -0,0 +1,500 @@ +#include <linux/init.h> +#include <linux/fs.h> +#include <linux/slab.h> +#include <linux/types.h> +#include <linux/fcntl.h> +#include <linux/delay.h> +#include <linux/string.h> +#include <linux/syscalls.h> + +static __initdata char *message; +static void __init error(char *x) +{ + if (!message) + message = x; +} + +static void __init *malloc(size_t size) +{ + return kmalloc(size, GFP_KERNEL); +} + +static void 
__init free(void *where) +{ + kfree(where); +} + +/* link hash */ + +static __initdata struct hash { + int ino, minor, major; + struct hash *next; + char *name; +} *head[32]; + +static inline int hash(int major, int minor, int ino) +{ + unsigned long tmp = ino + minor + (major << 3); + tmp += tmp >> 5; + return tmp & 31; +} + +static char __init *find_link(int major, int minor, int ino, char *name) +{ + struct hash **p, *q; + for (p = head + hash(major, minor, ino); *p; p = &(*p)->next) { + if ((*p)->ino != ino) + continue; + if ((*p)->minor != minor) + continue; + if ((*p)->major != major) + continue; + return (*p)->name; + } + q = (struct hash *)malloc(sizeof(struct hash)); + if (!q) + panic("can't allocate link hash entry"); + q->ino = ino; + q->minor = minor; + q->major = major; + q->name = name; + q->next = NULL; + *p = q; + return NULL; +} + +static void __init free_hash(void) +{ + struct hash **p, *q; + for (p = head; p < head + 32; p++) { + while (*p) { + q = *p; + *p = q->next; + free(q); + } + } +} + +/* cpio header parsing */ + +static __initdata unsigned long ino, major, minor, nlink; +static __initdata mode_t mode; +static __initdata unsigned long body_len, name_len; +static __initdata uid_t uid; +static __initdata gid_t gid; +static __initdata unsigned rdev; + +static void __init parse_header(char *s) +{ + unsigned long parsed[12]; + char buf[9]; + int i; + + buf[8] = '\0'; + for (i = 0, s += 6; i < 12; i++, s += 8) { + memcpy(buf, s, 8); + parsed[i] = simple_strtoul(buf, NULL, 16); + } + ino = parsed[0]; + mode = parsed[1]; + uid = parsed[2]; + gid = parsed[3]; + nlink = parsed[4]; + body_len = parsed[6]; + major = parsed[7]; + minor = parsed[8]; + rdev = new_encode_dev(MKDEV(parsed[9], parsed[10])); + name_len = parsed[11]; +} + +/* FSM */ + +static __initdata enum state { + Start, + Collect, + GotHeader, + SkipIt, + GotName, + CopyFile, + GotSymlink, + Reset +} state, next_state; + +static __initdata char *victim; +static __initdata unsigned count; 
+static __initdata loff_t this_header, next_header; + +static __initdata int dry_run; + +static inline void eat(unsigned n) +{ + victim += n; + this_header += n; + count -= n; +} + +#define N_ALIGN(len) ((((len) + 1) & ~3) + 2) + +static __initdata char *collected; +static __initdata int remains; +static __initdata char *collect; + +static void __init read_into(char *buf, unsigned size, enum state next) +{ + if (count >= size) { + collected = victim; + eat(size); + state = next; + } else { + collect = collected = buf; + remains = size; + next_state = next; + state = Collect; + } +} + +static __initdata char *header_buf, *symlink_buf, *name_buf; + +static int __init do_start(void) +{ + read_into(header_buf, 110, GotHeader); + return 0; +} + +static int __init do_collect(void) +{ + unsigned n = remains; + if (count < n) + n = count; + memcpy(collect, victim, n); + eat(n); + collect += n; + if ((remains -= n) != 0) + return 1; + state = next_state; + return 0; +} + +static int __init do_header(void) +{ + if (memcmp(collected, "070701", 6)) { + error("no cpio magic"); + return 1; + } + parse_header(collected); + next_header = this_header + N_ALIGN(name_len) + body_len; + next_header = (next_header + 3) & ~3; + if (dry_run) { + read_into(name_buf, N_ALIGN(name_len), GotName); + return 0; + } + state = SkipIt; + if (name_len <= 0 || name_len > PATH_MAX) + return 0; + if (S_ISLNK(mode)) { + if (body_len > PATH_MAX) + return 0; + collect = collected = symlink_buf; + remains = N_ALIGN(name_len) + body_len; + next_state = GotSymlink; + state = Collect; + return 0; + } + if (S_ISREG(mode) || !body_len) + read_into(name_buf, N_ALIGN(name_len), GotName); + return 0; +} + +static int __init do_skip(void) +{ + if (this_header + count < next_header) { + eat(count); + return 1; + } else { + eat(next_header - this_header); + state = next_state; + return 0; + } +} + +static int __init do_reset(void) +{ + while(count && *victim == '\0') + eat(1); + if (count && (this_header & 3)) + 
error("broken padding"); + return 1; +} + +static int __init maybe_link(void) +{ + if (nlink >= 2) { + char *old = find_link(major, minor, ino, collected); + if (old) + return (sys_link(old, collected) < 0) ? -1 : 1; + } + return 0; +} + +static __initdata int wfd; + +static int __init do_name(void) +{ + state = SkipIt; + next_state = Reset; + if (strcmp(collected, "TRAILER!!!") == 0) { + free_hash(); + return 0; + } + if (dry_run) + return 0; + if (S_ISREG(mode)) { + if (maybe_link() >= 0) { + wfd = sys_open(collected, O_WRONLY|O_CREAT, mode); + if (wfd >= 0) { + sys_fchown(wfd, uid, gid); + sys_fchmod(wfd, mode); + state = CopyFile; + } + } + } else if (S_ISDIR(mode)) { + sys_mkdir(collected, mode); + sys_chown(collected, uid, gid); + sys_chmod(collected, mode); + } else if (S_ISBLK(mode) || S_ISCHR(mode) || + S_ISFIFO(mode) || S_ISSOCK(mode)) { + if (maybe_link() == 0) { + sys_mknod(collected, mode, rdev); + sys_chown(collected, uid, gid); + sys_chmod(collected, mode); + } + } + return 0; +} + +static int __init do_copy(void) +{ + if (count >= body_len) { + sys_write(wfd, victim, body_len); + sys_close(wfd); + eat(body_len); + state = SkipIt; + return 0; + } else { + sys_write(wfd, victim, count); + body_len -= count; + eat(count); + return 1; + } +} + +static int __init do_symlink(void) +{ + collected[N_ALIGN(name_len) + body_len] = '\0'; + sys_symlink(collected + N_ALIGN(name_len), collected); + sys_lchown(collected, uid, gid); + state = SkipIt; + next_state = Reset; + return 0; +} + +static __initdata int (*actions[])(void) = { + [Start] = do_start, + [Collect] = do_collect, + [GotHeader] = do_header, + [SkipIt] = do_skip, + [GotName] = do_name, + [CopyFile] = do_copy, + [GotSymlink] = do_symlink, + [Reset] = do_reset, +}; + +static int __init write_buffer(char *buf, unsigned len) +{ + count = len; + victim = buf; + + while (!actions[state]()) + ; + return len - count; +} + +static void __init flush_buffer(char *buf, unsigned len) +{ + int written; + if 
(message) + return; + while ((written = write_buffer(buf, len)) < len && !message) { + char c = buf[written]; + if (c == '0') { + buf += written; + len -= written; + state = Start; + } else if (c == 0) { + buf += written; + len -= written; + state = Reset; + } else + error("junk in compressed archive"); + } +} + +/* + * gzip declarations + */ + +#define OF(args) args + +#ifndef memzero +#define memzero(s, n) memset ((s), 0, (n)) +#endif + +typedef unsigned char uch; +typedef unsigned short ush; +typedef unsigned long ulg; + +#define WSIZE 0x8000 /* window size--must be a power of two, and */ + /* at least 32K for zip's deflate method */ + +static uch *inbuf; +static uch *window; + +static unsigned insize; /* valid bytes in inbuf */ +static unsigned inptr; /* index of next byte to be processed in inbuf */ +static unsigned outcnt; /* bytes in output buffer */ +static long bytes_out; + +#define get_byte() (inptr < insize ? inbuf[inptr++] : -1) + +/* Diagnostic functions (stubbed out) */ +#define Assert(cond,msg) +#define Trace(x) +#define Tracev(x) +#define Tracevv(x) +#define Tracec(c,x) +#define Tracecv(c,x) + +#define STATIC static +#define INIT __init + +static void __init flush_window(void); +static void __init error(char *m); +static void __init gzip_mark(void **); +static void __init gzip_release(void **); + +#include "../lib/inflate.c" + +static void __init gzip_mark(void **ptr) +{ +} + +static void __init gzip_release(void **ptr) +{ +} + +/* =========================================================================== + * Write the output window window[0..outcnt-1] and update crc and bytes_out. + * (Used for the decompressed data only.) 
+ */ +static void __init flush_window(void) +{ + ulg c = crc; /* temporary variable */ + unsigned n; + uch *in, ch; + + flush_buffer(window, outcnt); + in = window; + for (n = 0; n < outcnt; n++) { + ch = *in++; + c = crc_32_tab[((int)c ^ ch) & 0xff] ^ (c >> 8); + } + crc = c; + bytes_out += (ulg)outcnt; + outcnt = 0; +} + +static char * __init unpack_to_rootfs(char *buf, unsigned len, int check_only) +{ + int written; + dry_run = check_only; + header_buf = malloc(110); + symlink_buf = malloc(PATH_MAX + N_ALIGN(PATH_MAX) + 1); + name_buf = malloc(N_ALIGN(PATH_MAX)); + window = malloc(WSIZE); + if (!window || !header_buf || !symlink_buf || !name_buf) + panic("can't allocate buffers"); + state = Start; + this_header = 0; + message = NULL; + while (!message && len) { + loff_t saved_offset = this_header; + if (*buf == '0' && !(this_header & 3)) { + state = Start; + written = write_buffer(buf, len); + buf += written; + len -= written; + continue; + } + if (!*buf) { + buf++; + len--; + this_header++; + continue; + } + this_header = 0; + insize = len; + inbuf = buf; + inptr = 0; + outcnt = 0; /* bytes in output buffer */ + bytes_out = 0; + crc = (ulg)0xffffffffL; /* shift register contents */ + makecrc(); + gunzip(); + if (state != Reset) + error("junk in gzipped archive"); + this_header = saved_offset + inptr; + buf += inptr; + len -= inptr; + } + free(window); + free(name_buf); + free(symlink_buf); + free(header_buf); + return message; +} + +extern char __initramfs_start[], __initramfs_end[]; +#ifdef CONFIG_BLK_DEV_INITRD +#include <linux/initrd.h> +#endif + +void __init populate_rootfs(void) +{ + char *err = unpack_to_rootfs(__initramfs_start, + __initramfs_end - __initramfs_start, 0); + if (err) + panic(err); +#ifdef CONFIG_BLK_DEV_INITRD + if (initrd_start) { + int fd; + printk(KERN_INFO "checking if image is initramfs..."); + err = unpack_to_rootfs((char *)initrd_start, + initrd_end - initrd_start, 1); + if (!err) { + printk(" it is\n"); + unpack_to_rootfs((char 
*)initrd_start, + initrd_end - initrd_start, 0); + free_initrd_mem(initrd_start, initrd_end); + return; + } + printk("it isn't (%s); looks like an initrd\n", err); + fd = sys_open("/initrd.image", O_WRONLY|O_CREAT, 700); + if (fd >= 0) { + sys_write(fd, (char *)initrd_start, + initrd_end - initrd_start); + sys_close(fd); + free_initrd_mem(initrd_start, initrd_end); + } + } +#endif +} diff --git a/init/main.c b/init/main.c new file mode 100644 index 000000000000..40bf367ffdf1 --- /dev/null +++ b/init/main.c @@ -0,0 +1,713 @@ +/* + * linux/init/main.c + * + * Copyright (C) 1991, 1992 Linus Torvalds + * + * GK 2/5/95 - Changed to support mounting root fs via NFS + * Added initrd & change_root: Werner Almesberger & Hans Lermen, Feb '96 + * Moan early if gcc is old, avoiding bogus kernels - Paul Gortmaker, May '96 + * Simplified starting of init: Michael A. Griffith <grif@acm.org> + */ + +#define __KERNEL_SYSCALLS__ + +#include <linux/config.h> +#include <linux/types.h> +#include <linux/module.h> +#include <linux/proc_fs.h> +#include <linux/devfs_fs_kernel.h> +#include <linux/kernel.h> +#include <linux/syscalls.h> +#include <linux/string.h> +#include <linux/ctype.h> +#include <linux/delay.h> +#include <linux/utsname.h> +#include <linux/ioport.h> +#include <linux/init.h> +#include <linux/smp_lock.h> +#include <linux/initrd.h> +#include <linux/hdreg.h> +#include <linux/bootmem.h> +#include <linux/tty.h> +#include <linux/gfp.h> +#include <linux/percpu.h> +#include <linux/kmod.h> +#include <linux/kernel_stat.h> +#include <linux/security.h> +#include <linux/workqueue.h> +#include <linux/profile.h> +#include <linux/rcupdate.h> +#include <linux/moduleparam.h> +#include <linux/kallsyms.h> +#include <linux/writeback.h> +#include <linux/cpu.h> +#include <linux/cpuset.h> +#include <linux/efi.h> +#include <linux/unistd.h> +#include <linux/rmap.h> +#include <linux/mempolicy.h> +#include <linux/key.h> + +#include <asm/io.h> +#include <asm/bugs.h> +#include <asm/setup.h> + +/* + * 
This is one of the first .c files built. Error out early + * if we have compiler trouble.. + */ +#if __GNUC__ == 2 && __GNUC_MINOR__ == 96 +#ifdef CONFIG_FRAME_POINTER +#error This compiler cannot compile correctly with frame pointers enabled +#endif +#endif + +#ifdef CONFIG_X86_LOCAL_APIC +#include <asm/smp.h> +#endif + +/* + * Versions of gcc older than that listed below may actually compile + * and link okay, but the end product can have subtle run time bugs. + * To avoid associated bogus bug reports, we flatly refuse to compile + * with a gcc that is known to be too old from the very beginning. + */ +#if __GNUC__ < 2 || (__GNUC__ == 2 && __GNUC_MINOR__ < 95) +#error Sorry, your GCC is too old. It builds incorrect kernels. +#endif + +static int init(void *); + +extern void init_IRQ(void); +extern void sock_init(void); +extern void fork_init(unsigned long); +extern void mca_init(void); +extern void sbus_init(void); +extern void sysctl_init(void); +extern void signals_init(void); +extern void buffer_init(void); +extern void pidhash_init(void); +extern void pidmap_init(void); +extern void prio_tree_init(void); +extern void radix_tree_init(void); +extern void free_initmem(void); +extern void populate_rootfs(void); +extern void driver_init(void); +extern void prepare_namespace(void); +#ifdef CONFIG_ACPI +extern void acpi_early_init(void); +#else +static inline void acpi_early_init(void) { } +#endif + +#ifdef CONFIG_TC +extern void tc_init(void); +#endif + +enum system_states system_state; +EXPORT_SYMBOL(system_state); + +/* + * Boot command-line arguments + */ +#define MAX_INIT_ARGS CONFIG_INIT_ENV_ARG_LIMIT +#define MAX_INIT_ENVS CONFIG_INIT_ENV_ARG_LIMIT + +extern void time_init(void); +/* Default late time init is NULL. archs can override this later. */ +void (*late_time_init)(void); +extern void softirq_init(void); + +/* Untouched command line (eg. for /proc) saved by arch-specific code. 
*/ +char saved_command_line[COMMAND_LINE_SIZE]; + +static char *execute_command; + +/* Setup configured maximum number of CPUs to activate */ +static unsigned int max_cpus = NR_CPUS; + +/* + * Setup routine for controlling SMP activation + * + * Command-line option of "nosmp" or "maxcpus=0" will disable SMP + * activation entirely (the MPS table probe still happens, though). + * + * Command-line option of "maxcpus=<NUM>", where <NUM> is an integer + * greater than 0, limits the maximum number of CPUs activated in + * SMP mode to <NUM>. + */ +static int __init nosmp(char *str) +{ + max_cpus = 0; + return 1; +} + +__setup("nosmp", nosmp); + +static int __init maxcpus(char *str) +{ + get_option(&str, &max_cpus); + return 1; +} + +__setup("maxcpus=", maxcpus); + +static char * argv_init[MAX_INIT_ARGS+2] = { "init", NULL, }; +char * envp_init[MAX_INIT_ENVS+2] = { "HOME=/", "TERM=linux", NULL, }; +static const char *panic_later, *panic_param; + +extern struct obs_kernel_param __setup_start[], __setup_end[]; + +static int __init obsolete_checksetup(char *line) +{ + struct obs_kernel_param *p; + + p = __setup_start; + do { + int n = strlen(p->str); + if (!strncmp(line, p->str, n)) { + if (p->early) { + /* Already done in parse_early_param? 
(Needs + * exact match on param part) */ + if (line[n] == '\0' || line[n] == '=') + return 1; + } else if (!p->setup_func) { + printk(KERN_WARNING "Parameter %s is obsolete," + " ignored\n", p->str); + return 1; + } else if (p->setup_func(line + n)) + return 1; + } + p++; + } while (p < __setup_end); + return 0; +} + +/* + * This should be approx 2 Bo*oMips to start (note initial shift), and will + * still work even if initially too large, it will just take slightly longer + */ +unsigned long loops_per_jiffy = (1<<12); + +EXPORT_SYMBOL(loops_per_jiffy); + +static int __init debug_kernel(char *str) +{ + if (*str) + return 0; + console_loglevel = 10; + return 1; +} + +static int __init quiet_kernel(char *str) +{ + if (*str) + return 0; + console_loglevel = 4; + return 1; +} + +__setup("debug", debug_kernel); +__setup("quiet", quiet_kernel); + +static int __init loglevel(char *str) +{ + get_option(&str, &console_loglevel); + return 1; +} + +__setup("loglevel=", loglevel); + +/* + * Unknown boot options get handed to init, unless they look like + * failed parameters + */ +static int __init unknown_bootoption(char *param, char *val) +{ + /* Change NUL term back to "=", to make "param" the whole string. */ + if (val) { + /* param=val or param="val"? */ + if (val == param+strlen(param)+1) + val[-1] = '='; + else if (val == param+strlen(param)+2) { + val[-2] = '='; + memmove(val-1, val, strlen(val)+1); + val--; + } else + BUG(); + } + + /* Handle obsolete-style parameters */ + if (obsolete_checksetup(param)) + return 0; + + /* + * Preemptive maintenance for "why didn't my mispelled command + * line work?" 
+ */ + if (strchr(param, '.') && (!val || strchr(param, '.') < val)) { + printk(KERN_ERR "Unknown boot option `%s': ignoring\n", param); + return 0; + } + + if (panic_later) + return 0; + + if (val) { + /* Environment option */ + unsigned int i; + for (i = 0; envp_init[i]; i++) { + if (i == MAX_INIT_ENVS) { + panic_later = "Too many boot env vars at `%s'"; + panic_param = param; + } + if (!strncmp(param, envp_init[i], val - param)) + break; + } + envp_init[i] = param; + } else { + /* Command line option */ + unsigned int i; + for (i = 0; argv_init[i]; i++) { + if (i == MAX_INIT_ARGS) { + panic_later = "Too many boot init vars at `%s'"; + panic_param = param; + } + } + argv_init[i] = param; + } + return 0; +} + +static int __init init_setup(char *str) +{ + unsigned int i; + + execute_command = str; + /* + * In case LILO is going to boot us with default command line, + * it prepends "auto" before the whole cmdline which makes + * the shell think it should execute a script with such name. + * So we ignore all arguments entered _before_ init=... 
[MJ] + */ + for (i = 1; i < MAX_INIT_ARGS; i++) + argv_init[i] = NULL; + return 1; +} +__setup("init=", init_setup); + +extern void setup_arch(char **); + +#ifndef CONFIG_SMP + +#ifdef CONFIG_X86_LOCAL_APIC +static void __init smp_init(void) +{ + APIC_init_uniprocessor(); +} +#else +#define smp_init() do { } while (0) +#endif + +static inline void setup_per_cpu_areas(void) { } +static inline void smp_prepare_cpus(unsigned int maxcpus) { } + +#else + +#ifdef __GENERIC_PER_CPU +unsigned long __per_cpu_offset[NR_CPUS]; + +EXPORT_SYMBOL(__per_cpu_offset); + +static void __init setup_per_cpu_areas(void) +{ + unsigned long size, i; + char *ptr; + /* Created by linker magic */ + extern char __per_cpu_start[], __per_cpu_end[]; + + /* Copy section for each CPU (we discard the original) */ + size = ALIGN(__per_cpu_end - __per_cpu_start, SMP_CACHE_BYTES); +#ifdef CONFIG_MODULES + if (size < PERCPU_ENOUGH_ROOM) + size = PERCPU_ENOUGH_ROOM; +#endif + + ptr = alloc_bootmem(size * NR_CPUS); + + for (i = 0; i < NR_CPUS; i++, ptr += size) { + __per_cpu_offset[i] = ptr - __per_cpu_start; + memcpy(ptr, __per_cpu_start, __per_cpu_end - __per_cpu_start); + } +} +#endif /* !__GENERIC_PER_CPU */ + +/* Called by boot processor to activate the rest. */ +static void __init smp_init(void) +{ + unsigned int i; + + /* FIXME: This should be done in userspace --RR */ + for_each_present_cpu(i) { + if (num_online_cpus() >= max_cpus) + break; + if (!cpu_online(i)) + cpu_up(i); + } + + /* Any cleanup work */ + printk(KERN_INFO "Brought up %ld CPUs\n", (long)num_online_cpus()); + smp_cpus_done(max_cpus); +#if 0 + /* Get other processors into their bootup holding patterns. */ + + smp_commence(); +#endif +} + +#endif + +/* + * We need to finalize in a non-__init function or else race conditions + * between the root thread and the init thread may cause start_kernel to + * be reaped by free_initmem before the root thread has proceeded to + * cpu_idle. 
+ * + * gcc-3.4 accidentally inlines this function, so use noinline. + */ + +static void noinline rest_init(void) + __releases(kernel_lock) +{ + kernel_thread(init, NULL, CLONE_FS | CLONE_SIGHAND); + numa_default_policy(); + unlock_kernel(); + preempt_enable_no_resched(); + cpu_idle(); +} + +/* Check for early params. */ +static int __init do_early_param(char *param, char *val) +{ + struct obs_kernel_param *p; + + for (p = __setup_start; p < __setup_end; p++) { + if (p->early && strcmp(param, p->str) == 0) { + if (p->setup_func(val) != 0) + printk(KERN_WARNING + "Malformed early option '%s'\n", param); + } + } + /* We accept everything at this stage. */ + return 0; +} + +/* Arch code calls this early on, or if not, just before other parsing. */ +void __init parse_early_param(void) +{ + static __initdata int done = 0; + static __initdata char tmp_cmdline[COMMAND_LINE_SIZE]; + + if (done) + return; + + /* All fall through to do_early_param. */ + strlcpy(tmp_cmdline, saved_command_line, COMMAND_LINE_SIZE); + parse_args("early options", tmp_cmdline, NULL, 0, do_early_param); + done = 1; +} + +/* + * Activate the first processor. + */ + +asmlinkage void __init start_kernel(void) +{ + char * command_line; + extern struct kernel_param __start___param[], __stop___param[]; +/* + * Interrupts are still disabled. Do necessary setups, then + * enable them + */ + lock_kernel(); + page_address_init(); + printk(KERN_NOTICE); + printk(linux_banner); + setup_arch(&command_line); + setup_per_cpu_areas(); + + /* + * Mark the boot cpu "online" so that it can call console drivers in + * printk() and can access its per-cpu storage. + */ + smp_prepare_boot_cpu(); + + /* + * Set up the scheduler prior starting any interrupts (such as the + * timer interrupt). Full topology setup happens at smp_init() + * time - but meanwhile we still have a functioning scheduler. 
+ */ + sched_init(); + /* + * Disable preemption - early bootup scheduling is extremely + * fragile until we cpu_idle() for the first time. + */ + preempt_disable(); + build_all_zonelists(); + page_alloc_init(); + printk(KERN_NOTICE "Kernel command line: %s\n", saved_command_line); + parse_early_param(); + parse_args("Booting kernel", command_line, __start___param, + __stop___param - __start___param, + &unknown_bootoption); + sort_main_extable(); + trap_init(); + rcu_init(); + init_IRQ(); + pidhash_init(); + init_timers(); + softirq_init(); + time_init(); + + /* + * HACK ALERT! This is early. We're enabling the console before + * we've done PCI setups etc, and console_init() must be aware of + * this. But we do want output early, in case something goes wrong. + */ + console_init(); + if (panic_later) + panic(panic_later, panic_param); + profile_init(); + local_irq_enable(); +#ifdef CONFIG_BLK_DEV_INITRD + if (initrd_start && !initrd_below_start_ok && + initrd_start < min_low_pfn << PAGE_SHIFT) { + printk(KERN_CRIT "initrd overwritten (0x%08lx < 0x%08lx) - " + "disabling it.\n",initrd_start,min_low_pfn << PAGE_SHIFT); + initrd_start = 0; + } +#endif + vfs_caches_init_early(); + mem_init(); + kmem_cache_init(); + numa_policy_init(); + if (late_time_init) + late_time_init(); + calibrate_delay(); + pidmap_init(); + pgtable_cache_init(); + prio_tree_init(); + anon_vma_init(); +#ifdef CONFIG_X86 + if (efi_enabled) + efi_enter_virtual_mode(); +#endif + fork_init(num_physpages); + proc_caches_init(); + buffer_init(); + unnamed_dev_init(); + key_init(); + security_init(); + vfs_caches_init(num_physpages); + radix_tree_init(); + signals_init(); + /* rootfs populating might need page-writeback */ + page_writeback_init(); +#ifdef CONFIG_PROC_FS + proc_root_init(); +#endif + cpuset_init(); + + check_bugs(); + + acpi_early_init(); /* before LAPIC and SMP init */ + + /* Do the rest non-__init'ed, we're now alive */ + rest_init(); +} + +static int __initdata initcall_debug; + 
+static int __init initcall_debug_setup(char *str) +{ + initcall_debug = 1; + return 1; +} +__setup("initcall_debug", initcall_debug_setup); + +struct task_struct *child_reaper = &init_task; + +extern initcall_t __initcall_start[], __initcall_end[]; + +static void __init do_initcalls(void) +{ + initcall_t *call; + int count = preempt_count(); + + for (call = __initcall_start; call < __initcall_end; call++) { + char *msg; + + if (initcall_debug) { + printk(KERN_DEBUG "Calling initcall 0x%p", *call); + print_fn_descriptor_symbol(": %s()", (unsigned long) *call); + printk("\n"); + } + + (*call)(); + + msg = NULL; + if (preempt_count() != count) { + msg = "preemption imbalance"; + preempt_count() = count; + } + if (irqs_disabled()) { + msg = "disabled interrupts"; + local_irq_enable(); + } + if (msg) { + printk(KERN_WARNING "error in initcall at 0x%p: " + "returned with %s\n", *call, msg); + } + } + + /* Make sure there is no pending stuff from the initcall sequence */ + flush_scheduled_work(); +} + +/* + * Ok, the machine is now initialized. None of the devices + * have been touched yet, but the CPU subsystem is up and + * running, and memory and process management works. + * + * Now we can finally start doing some real work.. 
+ */ +static void __init do_basic_setup(void) +{ + /* drivers will send hotplug events */ + init_workqueues(); + usermodehelper_init(); + driver_init(); + +#ifdef CONFIG_SYSCTL + sysctl_init(); +#endif + + /* Networking initialization needs a process context */ + sock_init(); + + do_initcalls(); +} + +static void do_pre_smp_initcalls(void) +{ + extern int spawn_ksoftirqd(void); +#ifdef CONFIG_SMP + extern int migration_init(void); + + migration_init(); +#endif + spawn_ksoftirqd(); +} + +static void run_init_process(char *init_filename) +{ + argv_init[0] = init_filename; + execve(init_filename, argv_init, envp_init); +} + +static inline void fixup_cpu_present_map(void) +{ +#ifdef CONFIG_SMP + int i; + + /* + * If arch is not hotplug ready and did not populate + * cpu_present_map, just make cpu_present_map same as cpu_possible_map + * for other cpu bringup code to function as normal. e.g smp_init() etc. + */ + if (cpus_empty(cpu_present_map)) { + for_each_cpu(i) { + cpu_set(i, cpu_present_map); + } + } +#endif +} + +static int init(void * unused) +{ + lock_kernel(); + /* + * init can run on any cpu. + */ + set_cpus_allowed(current, CPU_MASK_ALL); + /* + * Tell the world that we're going to be the grim + * reaper of innocent orphaned children. + * + * We don't want people to have to make incorrect + * assumptions about where in the task array this + * can be found. + */ + child_reaper = current; + + /* Sets up cpus_possible() */ + smp_prepare_cpus(max_cpus); + + do_pre_smp_initcalls(); + + fixup_cpu_present_map(); + smp_init(); + sched_init_smp(); + + cpuset_init_smp(); + + /* + * Do this before initcalls, because some drivers want to access + * firmware files. + */ + populate_rootfs(); + + do_basic_setup(); + + /* + * check if there is an early userspace init. 
If yes, let it do all + * the work + */ + if (sys_access((const char __user *) "/init", 0) == 0) + execute_command = "/init"; + else + prepare_namespace(); + + /* + * Ok, we have completed the initial bootup, and + * we're essentially up and running. Get rid of the + * initmem segments and start the user-mode stuff.. + */ + free_initmem(); + unlock_kernel(); + system_state = SYSTEM_RUNNING; + numa_default_policy(); + + if (sys_open((const char __user *) "/dev/console", O_RDWR, 0) < 0) + printk(KERN_WARNING "Warning: unable to open an initial console.\n"); + + (void) sys_dup(0); + (void) sys_dup(0); + + /* + * We try each of these until one succeeds. + * + * The Bourne shell can be used instead of init if we are + * trying to recover a really broken machine. + */ + + if (execute_command) + run_init_process(execute_command); + + run_init_process("/sbin/init"); + run_init_process("/etc/init"); + run_init_process("/bin/init"); + run_init_process("/bin/sh"); + + panic("No init found. Try passing init= option to kernel."); +} diff --git a/init/version.c b/init/version.c new file mode 100644 index 000000000000..3ddc3ceec2fe --- /dev/null +++ b/init/version.c @@ -0,0 +1,33 @@ +/* + * linux/init/version.c + * + * Copyright (C) 1992 Theodore Ts'o + * + * May be freely distributed as part of Linux. + */ + +#include <linux/compile.h> +#include <linux/module.h> +#include <linux/uts.h> +#include <linux/utsname.h> +#include <linux/version.h> + +#define version(a) Version_ ## a +#define version_string(a) version(a) + +int version_string(LINUX_VERSION_CODE); + +struct new_utsname system_utsname = { + .sysname = UTS_SYSNAME, + .nodename = UTS_NODENAME, + .release = UTS_RELEASE, + .version = UTS_VERSION, + .machine = UTS_MACHINE, + .domainname = UTS_DOMAINNAME, +}; + +EXPORT_SYMBOL(system_utsname); + +const char linux_banner[] = + "Linux version " UTS_RELEASE " (" LINUX_COMPILE_BY "@" + LINUX_COMPILE_HOST ") (" LINUX_COMPILER ") " UTS_VERSION "\n"; |
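Note on the `actions[]` table in init/initramfs.c above: the unpacker is a table-driven state machine. `write_buffer()` points `victim`/`count` at the new input and keeps calling `actions[state]()` until a handler returns nonzero, which in `do_copy()` means the input ran out before the file body was complete. The following is a minimal userspace sketch of that pattern, not the kernel code. The record format (one length digit followed by that many bytes, with `0` ending the stream), the states `ReadLen`/`ReadBody`/`Done`, the handlers `do_len`/`do_body`/`do_done`, and the `out` buffer are all hypothetical and exist only for illustration. The names `actions`, `write_buffer`, `victim`, `count`, `body_len` and `eat` are borrowed from the diff, with simplified bodies.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical states for this sketch; the kernel uses cpio-specific ones. */
enum state { ReadLen, ReadBody, Done };

static enum state state = ReadLen;
static const char *victim;   /* next unconsumed input byte */
static unsigned count;       /* bytes left in the current chunk */
static unsigned body_len;    /* body bytes still expected for this record */
static char out[64];         /* demo sink standing in for the output file */
static unsigned out_len;

/* Consume n bytes of the current chunk (simplified from the diff's eat()). */
static void eat(unsigned n)
{
	victim += n;
	count -= n;
}

/* Handlers return 0 to keep dispatching, nonzero to stop the loop. */
static int do_len(void)
{
	if (count == 0)
		return 1;               /* out of input, wait for next chunk */
	body_len = (unsigned)(*victim - '0');
	eat(1);
	state = body_len ? ReadBody : Done;
	return 0;
}

static int do_body(void)
{
	unsigned n = count < body_len ? count : body_len;

	if (n > sizeof(out) - out_len) {    /* demo sink is full: give up */
		state = Done;
		return 0;
	}
	memcpy(out + out_len, victim, n);
	out_len += n;
	body_len -= n;
	eat(n);
	if (body_len)
		return 1;               /* body continues in the next chunk */
	state = ReadLen;
	return 0;
}

static int do_done(void)
{
	return 1;
}

/* Dispatch table indexed by state, as in the diff's actions[]. */
static int (*actions[])(void) = {
	[ReadLen]  = do_len,
	[ReadBody] = do_body,
	[Done]     = do_done,
};

/* Feed one chunk; returns how many bytes of it were consumed. */
static unsigned write_buffer(const char *buf, unsigned len)
{
	count = len;
	victim = buf;
	assert(state <= Done);          /* guard the table index */
	while (!actions[state]())
		;
	return len - count;
}
```

Because the parser state lives in file-scope variables rather than on the stack, a record may be split across calls: feeding `"3ab"` and then `"c2de0"` leaves `abcde` in `out`. That is the property the kernel code relies on when `flush_window()` hands the unpacker one decompressed window at a time.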