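Operating System Basics
An operating system is usually divided into a kernel and user space.
The kernel gives software a layer through which to interact with the hardware. It abstracts the hardware so that much software can run the same way on very different machines. The kernel provides system calls that let user space interact with it. The kernel handles many things, including file systems (usually, though not always), devices, and process control.
User space is everything that exists outside the kernel. All processes created by users, including the terminal, live in user space. The graphical user interface (GUI) that displays programs also lives in user space.
The Unix shell is a command interpreter program that serves as the primary interface between the user and the operating system in a command-line environment (such as a terminal or terminal emulator). The shell is an essential (and often preferred) tool, but it is just an ordinary user program that uses system calls to do most of its work, which is why it is only a "shell".
Many shells exist today, each with its own feature set. The most common is the Bourne shell. The Bourne shell (commonly known by its POSIX location, /bin/sh) has existed for decades and can be found on essentially any Unix machine. Although it lacks some interactive features, it is so widespread that a script written for it will run on virtually any Unix system.
The shell's main job is to present the user with a command prompt (for example $), wait for a command, and then execute it.
A shell can also be used to write programs by putting shell commands in a text file. The first line of the file must name the interpreter in the form #!interpreter (for example: #!/bin/sh). When the file is executed, Unix reads the interpreter line and so knows which shell to use to interpret all the commands.
The overall structure of a shell can be: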
repeat forever
    read one line
    parse the command into a list of arguments
    if the line starts with a built-in command name (e.g. cd and exit)
    then
        perform the function (if it's exit, break out of the loop)
    else (it invokes a program, e.g. ls and cat)
        execute the program
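The built-in branch of the loop above could look roughly like this in C. This is only a sketch: the function name try_builtin and its three return codes are assumptions made for illustration, not part of any real shell.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical return codes used only by this sketch. */
enum { BUILTIN_NONE = 0, BUILTIN_DONE = 1, BUILTIN_EXIT = 2 };

/* Handle the built-in commands "cd" and "exit".
   argv is a NULL-terminated argument list, as produced by the parser. */
int try_builtin(char **argv) {
    if (argv == NULL || argv[0] == NULL)
        return BUILTIN_DONE;        /* empty line: nothing to do */
    if (strcmp(argv[0], "exit") == 0)
        return BUILTIN_EXIT;        /* the caller breaks out of its loop */
    if (strcmp(argv[0], "cd") == 0) {
        /* cd has to run inside the shell process itself: a chdir() made
           in a child process would not change the shell's directory. */
        if (argv[1] == NULL)
            fprintf(stderr, "cd: missing argument\n");
        else if (chdir(argv[1]) != 0)
            perror("cd");
        return BUILTIN_DONE;
    }
    return BUILTIN_NONE;            /* not a built-in: run it as a program */
}
```

The read loop would call try_builtin() on every parsed line and only fall through to running an external program when it returns BUILTIN_NONE.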
To read commands, we read one line at a time and split the line into tokens.
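One simple way to do the tokenizing step is strtok() from the C library. The sketch below assumes tokens are separated only by whitespace; it does not handle quoting or escapes, and the name tokenize is hypothetical.

```c
#include <string.h>

/* Split LINE in place into whitespace-separated tokens.
   Pointers to the tokens are stored in ARGV, followed by a NULL
   terminator, so ARGV can be passed directly to execvp().
   At most MAX_ARGS - 1 tokens are stored. Returns the token count. */
int tokenize(char *line, char **argv, int max_args) {
    const char *delims = " \t\r\n";
    int count = 0;
    if (max_args < 1)
        return 0;                   /* no room even for the terminator */
    char *tok = strtok(line, delims);
    while (tok != NULL && count < max_args - 1) {
        argv[count++] = tok;
        tok = strtok(NULL, delims);
    }
    argv[count] = NULL;             /* the list must end with NULL */
    return count;
}
```

Note that strtok() overwrites the delimiters in the line with '\0', so the line buffer must be writable and must outlive the token list.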
To execute a program, several steps are needed, starting with duplicating the current process using the fork() system call:
/* modified from fork.c in Advanced Linux Programming (page 49) */
/* http://www.makelinux.net/alp/024.htm */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(){
    pid_t child_pid;
    printf("the main program process ID is %d\n", (int) getpid());
    child_pid = fork();
    if(child_pid == -1){
        /* fork() failed: no child process was created. */
        perror("fork");
        return 1;
    }else if(child_pid != 0){
        printf("this is the parent process, with id %d\n", (int) getpid());
        printf("child_pid=%d\n", (int) child_pid);
    }else{
        printf("this is the child process, with id %d\n", (int) getpid());
        printf("child_pid=%d\n", (int) child_pid);
    }
    return 0;
}
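This example shows how to create a new process by forking the current one. Note that the fork() function (a system call) is called **once** but returns **twice**, because when the call completes there are two processes executing the same code. In the parent it returns the child's process ID; in the child it returns 0.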
/* from Advanced Linux Programming (page 51) */
/* http://www.makelinux.net/alp/024.htm */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>
/* Spawn a child process running a new program. PROGRAM is the name
of the program to run; the path will be searched for this program.
ARG_LIST is a NULL-terminated list of character strings to be
passed as the program's argument list. Returns the process ID of
the spawned process. */
int spawn(char* program, char** arg_list) {
    pid_t child_pid;
    /* Duplicate this process. */
    child_pid = fork();
    if (child_pid != 0){
        /* This is the parent process. */
        return child_pid;
    }else {
        /* Now execute PROGRAM, searching for it in the path. */
        execvp(program, arg_list);
        /* The execvp function returns only if an error occurs. */
        fprintf(stderr, "an error occurred in execvp\n");
        abort();
    }
}

int main() {
    /* The argument list to pass to the "ls" command. */
    char* arg_list[] = {
        "ls", /* argv[0], the name of the program. */
        "-l",
        "/",
        NULL /* The argument list must end with a NULL. */
    };
    /* Spawn a child process running the "ls" command. Ignore the
       returned child process ID. */
    spawn("ls", arg_list);
    printf("done with main program\n");
    return 0;
}
/* from Advanced Linux Programming (page 51) */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
/* Spawn a child process running a new program. PROGRAM is the name
of the program to run; the path will be searched for this program.
ARG_LIST is a NULL-terminated list of character strings to be
passed as the program's argument list. Returns the process ID of
the spawned process. */
int spawn(char* program, char** arg_list) {
    pid_t child_pid;
    /* Duplicate this process. */
    child_pid = fork();
    if (child_pid != 0){
        /* This is the parent process (or fork() failed and returned -1). */
        return child_pid;
    }else {
        /* Now execute PROGRAM, searching for it in the path. */
        execvp(program, arg_list);
        /* The execvp function returns only if an error occurs. */
        fprintf(stderr, "an error occurred in execvp\n");
        abort();
    }
}

int main() {
    /* The argument list to pass to the "ls" command. */
    char* arg_list[] = {
        "ls", /* argv[0], the name of the program. */
        "-l",
        "/",
        NULL /* The argument list must end with a NULL. */
    };
    /* Spawn a child process running the "ls" command. Keep the
       returned child process ID so that we can wait for it. */
    pid_t pid = spawn("ls", arg_list);
    if (pid == -1){
        fprintf(stderr, "failed to create the child process\n");
        return 1;
    }
    /* Wait for the child process to complete. */
    int child_status;
    waitpid(pid, &child_status, 0);
    if (WIFEXITED(child_status)){
        printf("the child process exited normally, with exit code %d\n",
               WEXITSTATUS(child_status));
    }else{
        printf("the child process exited abnormally\n");
    }
    return 0;
}
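Note that the argument list must have the program name as its first element and **must** end with NULL, which marks the end of the list. Otherwise execvp() will not work correctly.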
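#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>   /* S_IRWXU */
#include <fcntl.h>
#include <unistd.h>

int main(){
    int overwrite = 0;
    int fd;
    if(overwrite){
        printf("open test.txt to overwrite.\n");
        fd = open("test.txt", O_WRONLY | O_CREAT, S_IRWXU);
    }else{
        printf("open test.txt to append.\n");
        fd = open("test.txt", O_WRONLY | O_APPEND | O_CREAT, S_IRWXU);
    }
    if(fd == -1){
        perror("open");
        return 1;
    }
    /* Flush what has been printed so far, so that it reaches the
       original standard output rather than the file. */
    fflush(stdout);
    /* Make file descriptor 1 (standard output) refer to test.txt. */
    dup2(fd, STDOUT_FILENO);
    printf("hello world!");
    return 0;
}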
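This example opens a file ("test.txt") for writing and makes standard output a synonym for the open file: bytes sent to standard output go to the file. The "O_WRONLY | O_CREAT" flags open the file write-only and create it if it does not exist. Without O_TRUNC the existing contents are not discarded, and with O_APPEND every write goes to the end of the file.
Information about each open file is recorded in a table managed by the operating system. Each entry corresponds to one open file, and the entry's index is an integer (the file descriptor), which is returned to the process that opened the file as the return value of the open() system call. The entries for standard input, standard output, and standard error are reserved under predefined indices: STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO (0, 1, and 2, defined in <unistd.h>). The call dup2(index1, index2) copies the entry at index1 into the entry at index2, which makes file descriptor index2 a synonym for file descriptor index1.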
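Because descriptors are just indices into that table, a redirection can also be undone by first copying the original entry somewhere else. The sketch below is one possible illustration; the function name print_to_file is hypothetical, and it adds O_TRUNC (not used in the example above) so that each call starts from an empty file.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Write MSG to PATH by temporarily pointing standard output at the
   file, then restore the original standard output.
   Returns 0 on success, -1 on failure. */
int print_to_file(const char *path, const char *msg) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;
    fflush(stdout);                   /* empty stdio's buffer first */
    int saved = dup(STDOUT_FILENO);   /* copy entry 1 to a free index */
    if (saved == -1 || dup2(fd, STDOUT_FILENO) == -1) {
        if (saved != -1)
            close(saved);
        close(fd);
        return -1;
    }
    close(fd);                        /* entry 1 still refers to the file */
    fputs(msg, stdout);
    fflush(stdout);                   /* push the bytes out before restoring */
    dup2(saved, STDOUT_FILENO);       /* put the original entry back */
    close(saved);
    return 0;
}
```

Closing fd after dup2() does not close the file, because entry STDOUT_FILENO still refers to it; the file is only released once every descriptor that refers to it has been closed or overwritten.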
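The line "dup2 (fd, STDOUT_FILENO)" (in the previous example) redirects the standard output of the current (main) process to the open file. If you want to redirect the standard output of a child process to a file instead, you need to wait until the child has been created, that is, do it in the child after the fork() call. The following example demonstrates the idea. You will see that the "hello" message from the parent process still goes to standard output, while the standard output of the child process is redirected to the file.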
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <sys/stat.h>
#include <fcntl.h>
int main() {
    /* The argument list to pass to the "ls" command. */
    char* arg_list[] = {
        "ls", /* argv[0], the name of the program. */
        "-l",
        "/",
        NULL /* The argument list must end with a NULL. */
    };
    /* Spawn a child process running the "ls" command. */
    pid_t child_pid = fork();
    if (child_pid == -1){
        perror("fork");
        return 1;
    }
    if (child_pid == 0){
        /* This is the child process. */
        char * filename = "test.txt";
        int outfile = open(filename, O_CREAT | O_WRONLY, S_IRWXU);
        if (outfile == -1){
            fprintf(stderr, "Error: failed to create file %s\n", filename);
            /* Exit here so the child never runs the parent's code below. */
            exit(EXIT_FAILURE);
        }
        /* redirect the standard output from this process to the file. */
        if(dup2(outfile, STDOUT_FILENO) != STDOUT_FILENO){
            fprintf(stderr, "Error: failed to redirect standard output\n");
            exit(EXIT_FAILURE);
        }
        /* Standard output now refers to the file; the extra descriptor
           is no longer needed. */
        close(outfile);
        /* Now execute PROGRAM, searching for it in the path. */
        execvp(arg_list[0], arg_list);
        /* The execvp function returns only if an error occurs. */
        fprintf(stderr, "an error occurred in execvp\n");
        abort();
    }
    /* only the parent process executes the following code. */
    fprintf(stdout, "Hello from the parent process.\n");
    /* Wait for the child process to complete. */
    int child_status;
    waitpid(child_pid, &child_status, 0);
    if (WIFEXITED(child_status)){
        printf("the child process exited normally, with exit code %d\n",
               WEXITSTATUS(child_status));
    }else{
        printf("the child process exited abnormally\n");
    }
    return 0;
}
/* from Advanced Linux Programming (page 113) */
/* http://www.makelinux.net/alp/038.htm */
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
int main () {
    int fds[2];
    pid_t pid;
    /* Create a pipe. File descriptors for the two ends of the pipe are
       placed in fds. */
    if (pipe(fds) != 0) {
        perror("pipe");
        return 1;
    }
    printf("fds[0]=%d, fds[1]=%d\n", fds[0], fds[1]);
    /* Flush now, so this line cannot end up in the pipe after standard
       output is redirected below. */
    fflush(stdout);
    /* Fork a child process. */
    pid = fork();
    if (pid == (pid_t) -1) {
        perror("fork");
        return 1;
    }
    if (pid == (pid_t) 0) {
        /* This is the child process. Close our copy of the write end of
           the file descriptor. */
        close(fds[1]);
        /* Connect the read end of the pipe to standard input. */
        dup2(fds[0], STDIN_FILENO);
        close(fds[0]);
        /* Replace the child process with the "sort" program. */
        execlp("sort", "sort", (char *) NULL);
        /* execlp returns only if an error occurs. */
        perror("execlp");
        _exit(127);
    } else {
        /* This is the parent process. */
        /* Close our copy of the read end of the file descriptor. */
        close(fds[0]);
        /* Connect the write end of the pipe to standard out, and write
           to it. */
        dup2(fds[1], STDOUT_FILENO);
        printf("This is a test.\n");
        printf("Hello, world.\n");
        printf("My dog has fleas.\n");
        printf("This program is great.\n");
        printf("One fish, two fish.\n");
        fflush(stdout);
        /* Close both copies of the write end so that sort sees end of file. */
        close(fds[1]);
        close(STDOUT_FILENO);
        /* Wait for the child process to finish. */
        waitpid(pid, NULL, 0);
        /* Standard output is closed now, so report on standard error. */
        fprintf(stderr, "parent process terminated.\n");
    }
    return 0;
}
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that runs the command in ARG_LIST, reading standard
   input from descriptor RD and writing standard output to descriptor WD.
   Returns the child's process ID (or -1 if fork() fails). */
pid_t run_command(char** arg_list, int rd, int wd) {
    pid_t child_pid;
    /* Duplicate this process. */
    child_pid = fork();
    if (child_pid != 0){
        /* This is the parent process (or fork() failed). */
        return child_pid;
    }else {
        if (rd != STDIN_FILENO){
            if(dup2(rd, STDIN_FILENO) != STDIN_FILENO){
                fprintf(stderr, "Error: failed to redirect standard input\n");
                /* Exit rather than return: returning would let the
                   child carry on running main(). */
                exit(EXIT_FAILURE);
            }
            close(rd);
        }
        if (wd != STDOUT_FILENO){
            /* Report on stderr so the message does not enter the pipe. */
            fprintf(stderr, "redirect stdout to %d.\n", wd);
            if(dup2(wd, STDOUT_FILENO) != STDOUT_FILENO){
                fprintf(stderr, "Error: failed to redirect standard output\n");
                exit(EXIT_FAILURE);
            }
            close(wd);
        }
        /* Now execute PROGRAM, searching for it in the path. */
        execvp(arg_list[0], arg_list);
        /* The execvp function returns only if an error occurs. */
        fprintf(stderr, "an error occurred in execvp\n");
        abort();
    }
}

int main() {
    /* The argument list for the command line "ls -l | wc -l". */
    char* arg_list[] = {
        "ls", /* argv[0], the name of the program. */
        "-l",
        "|", /* the pipe symbol is at index 2 */
        "wc",
        "-l",
        NULL /* The argument list must end with a NULL. */
    };
    int pipe_index = 2;
    int rd = STDIN_FILENO;
    int wd = STDOUT_FILENO;
    int fds[2];
    if (pipe(fds) != 0) {
        fprintf(stderr, "Error: unable to pipe command '%s'\n", arg_list[0]);
        return -1;
    }
    wd = fds[1]; /*file descriptor for the write end of the pipe*/
    // delete the pipe symbol and insert a null to terminate the
    // first command's argument list
    arg_list[pipe_index] = NULL;
    // run first command: read from STDIN and write to the pipe
    pid_t first = run_command(arg_list, rd, wd);
    // close the parent's write end so the reader can see end of file
    close(fds[1]);
    rd = fds[0];
    wd = STDOUT_FILENO;
    // run the second command: read from the pipe and write to STDOUT
    // the argument for this command starts at pipe_index+1
    pid_t second = run_command(arg_list+pipe_index+1, rd, wd);
    close(fds[0]);
    // wait for both commands before reporting completion
    waitpid(first, NULL, 0);
    waitpid(second, NULL, 0);
    fprintf(stderr, "done with main program\n");
    return 0;
}
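In this example, the original argument list is split into two lists at the pipe symbol, and the pipe symbol is replaced with a null value. This lets us use the two argument lists to run two separate commands/programs connected by a pipe.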
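The example hard-codes the position of the pipe symbol. A shell would have to search for it; a minimal sketch of that step follows. The name split_at_pipe is hypothetical, and only the first "|" is handled.

```c
#include <stddef.h>
#include <string.h>

/* Look for the first "|" token in the NULL-terminated list ARGV.
   If one is found, overwrite it with NULL so that ARGV now ends where
   the first command ends, and return a pointer to the second
   command's argument list. Return NULL if there is no pipe symbol. */
char **split_at_pipe(char **argv) {
    for (size_t i = 0; argv[i] != NULL; i++) {
        if (strcmp(argv[i], "|") == 0) {
            argv[i] = NULL;
            return &argv[i + 1];
        }
    }
    return NULL;
}
```

Both halves share the original array, so no copying is needed: the first command is argv itself and the second is the returned pointer.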
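A process is a program in execution. It is an abstraction for an entity managed by the operating system. A process has its own address space, along with other information kept in the operating system's management data structures.
Input and output are perhaps the most pervasive concepts in Unix. In Unix, everything is a file; this is deliberate, so that programs can interact with different devices in a uniform way. A file can be given to a program as input, and a file can be created from a program's output.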
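As a small illustration of that uniformity, the same open/write/close calls work whether the path names a regular file or a device. The helper name write_to below is hypothetical; the sketch assumes the path already exists and is writable.

```c
#include <fcntl.h>
#include <unistd.h>

/* Write LEN bytes from BUF to whatever PATH names: a regular file,
   a terminal, or another device node.
   Returns the number of bytes written, or -1 on error. */
ssize_t write_to(const char *path, const void *buf, size_t len) {
    int fd = open(path, O_WRONLY);
    if (fd == -1)
        return -1;
    ssize_t n = write(fd, buf, len);
    close(fd);
    return n;
}
```

For instance, write_to("/dev/null", "hi\n", 3) discards the bytes, while the same call with the path of an existing regular file stores them at the start of that file.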
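Every process has a set of input and output streams. A process can have as many streams as its developer wants, but every process has at least three, known as standard input, standard output, and standard error. Many programs use these simple streams so that users can manipulate them easily. Take the programs cat and grep: cat sends the given file to standard output (usually your terminal, unless told otherwise), and grep searches its standard input for a pattern. Entering the command `$ cat file.txt | grep "hi"` therefore searches the file for the given text "hi".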
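A program on the right-hand side of such a pipeline only needs to read one stream and write another. The sketch below is a much simplified stand-in for grep, not its real implementation: it matches plain substrings only, the name filter_lines is hypothetical, and lines longer than the buffer are treated as several shorter lines.

```c
#include <stdio.h>
#include <string.h>

/* Copy every line of IN that contains PATTERN (as a plain substring)
   to OUT. Returns the number of matching lines. */
int filter_lines(FILE *in, FILE *out, const char *pattern) {
    char line[1024];
    int matches = 0;
    while (fgets(line, sizeof line, in) != NULL) {
        if (strstr(line, pattern) != NULL) {
            fputs(line, out);
            matches++;
        }
    }
    return matches;
}
```

Calling filter_lines(stdin, stdout, "hi") from main() would let the program take the place of grep in the pipeline above, since the shell connects its standard input to the output of cat.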
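File abstraction - a stream of bytes whose meaning is defined by the users of the file system
File system users - end users (humans) and direct users (programs, such as applications or the shell)
User's view - a set of system calls, e.g. creat, open, close, seek, delete ...
File attributes - metadata: owner, size, timestamps ...
Directory abstraction - a list of files and directories that maps the name of each file/directory to the information needed to find its data; absolute and relative paths
Directory operations - create, delete, open, close, rename, link, unlink
Layout: partitions, boot block, superblock, ...
Disk block allocation:
  contiguous allocation
  linked list allocation: the first word in each block is used as a pointer to the next one.
  file allocation table: linked list allocation using a table in memory.
  i-node (index-node): an i-node is in memory when the corresponding file is open. With an i-node design we can calculate the largest possible file size.
Directory implementation: i-nodes, long file names
File sharing: symbolic links vs. hard links
Disk block size: a trade-off between wasted disk space and performance (data rate)
Tracking free blocks: linked list vs. bitmap
System backup
Caching
Steps to look up a file given a path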
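The remark above that an i-node design fixes the largest possible file size can be made concrete with a short calculation. The notes do not specify an i-node layout, so the sketch below assumes a classic Unix-style one: some direct block pointers plus one single-, one double-, and one triple-indirect pointer. The function name and parameters are hypothetical.

```c
#include <stdint.h>

/* Largest file, in bytes, addressable by one i-node that holds
   N_DIRECT direct block pointers plus one single-, one double- and
   one triple-indirect pointer (an assumed layout, for illustration).
   BLOCK_SIZE and PTR_SIZE are given in bytes. */
uint64_t max_file_size(uint64_t block_size, uint64_t ptr_size,
                       uint64_t n_direct) {
    uint64_t p = block_size / ptr_size;   /* pointers per block */
    uint64_t blocks = n_direct            /* direct blocks */
                    + p                   /* via the single-indirect block */
                    + p * p               /* via the double-indirect block */
                    + p * p * p;          /* via the triple-indirect block */
    return blocks * block_size;
}
```

With 4096-byte blocks, 4-byte block pointers, and 12 direct pointers, each indirect block holds 1024 pointers and the result is 4402345721856 bytes, just over 4 TiB. The triple-indirect term dominates, which is why larger blocks raise the limit so quickly.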