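Operating System Fundamentals

What Is an Operating System

An operating system is usually divided into a kernel and user space.

The kernel provides a layer through which software interacts with hardware. It abstracts the hardware so that much of the same software can run in the same way on very different machines. The kernel exposes system calls that let user space interact with it. The kernel handles many things, including file systems (not always, but usually), devices, and process control.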
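As a small illustration of user-space code asking the kernel for a service, the sketch below hands a string to the write() system call directly instead of going through the stdio library. The helper name say() is hypothetical and exists only for this example.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: pass MSG to the kernel with the write() system
   call on file descriptor FD. Returns the number of bytes written, or
   -1 on error. */
ssize_t say(int fd, const char *msg) {
  return write(fd, msg, strlen(msg));
}
```

Calling say(STDOUT_FILENO, "hello\n") would print the text on the terminal; the same call works unchanged on a file or a pipe, which is the kind of hardware abstraction described above.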

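User space is everything that exists outside the kernel. All processes created by users, including the terminal, live in user space. The graphical user interface (GUI) that displays programs is also in user space.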

Unix Shell

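A Unix shell is a command interpreter program that serves as the primary interface between the user and the operating system in a command-line environment such as a terminal or terminal emulator. The shell is an essential (and often preferred) tool, but it is just an ordinary user program that relies on system calls to do most of its work, which is why it is only a "shell".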

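Popular Shells

Many shells exist today, each with its own feature set. The most common is the Bourne shell. The Bourne shell (conventionally found at /bin/sh on POSIX systems) has been around for decades and can be found on practically any Unix machine. Although it lacks some interactive features, it is so widespread that a script written for it can be expected to run on any Unix system.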

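Shell Functions

The shell's main job is to present a command prompt (for example, $) to the user, wait for a command, and then execute it.

A shell can also be used to write programs by placing shell commands in a text file. The file must name its interpreter on the first line, in the form #!interpreter (for example, #!/bin/sh). When the file is executed, Unix reads that line and so knows which shell to use to interpret the commands.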
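To make the #!interpreter format concrete, here is a minimal sketch of how the first line of a script could be inspected. In practice the operating system does this work when the file is executed; the function name parse_shebang() and its behavior (ignoring any interpreter arguments) are assumptions made for illustration.

```c
#include <string.h>

/* Hypothetical helper: if LINE starts with "#!", copy the interpreter
   path that follows into OUT (at most OUT_SIZE-1 characters) and
   return 1; otherwise return 0. Interpreter arguments after the path
   are ignored in this sketch. */
int parse_shebang(const char *line, char *out, size_t out_size) {
  if (out_size == 0 || strncmp(line, "#!", 2) != 0)
    return 0;
  line += 2;
  while (*line == ' ' || *line == '\t')   /* allow "#! /bin/sh" */
    line++;
  size_t n = strcspn(line, " \t\r\n");    /* length of the path token */
  if (n == 0)
    return 0;
  if (n >= out_size)
    n = out_size - 1;
  memcpy(out, line, n);
  out[n] = '\0';
  return 1;
}
```

For the line "#!/bin/sh" the helper stores "/bin/sh" in the output buffer; for a line that does not begin with #! it reports that no interpreter was named.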

Creating a Shell

The overall structure of a shell can be:

 repeat forever
   read one line
   parse the command into a list of arguments
   if the line starts with a built-in command name (e.g. cd or exit)
   then
     perform the function (if it's exit, break out of the loop)
   else (it invokes a program, e.g. ls or cat)
     execute the program

To read a command, we read one line at a time and split that line into tokens.
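The tokenizing step can be sketched with the standard strtok() function. The helper name tokenize() and the choice of separators (spaces, tabs, and newlines, with no support for quoting) are assumptions for this example; a real shell has richer parsing rules.

```c
#include <string.h>

/* Hypothetical helper: split LINE in place on spaces, tabs and newlines,
   storing pointers to the tokens in ARGV. MAX_ARGS must be at least 1;
   at most MAX_ARGS-1 tokens are kept so that the list can always be
   NULL-terminated, as execvp() expects. Returns the number of tokens. */
int tokenize(char *line, char **argv, int max_args) {
  int argc = 0;
  char *token = strtok(line, " \t\n");
  while (token != NULL && argc < max_args - 1) {
    argv[argc++] = token;
    token = strtok(NULL, " \t\n");
  }
  argv[argc] = NULL;
  return argc;
}
```

Because strtok() writes terminators into the buffer, the line must be a modifiable array (for example, one filled by fgets()), not a string literal.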

To execute a program, the shell first needs to duplicate the current process using the fork() system call.

Cloning a Process with fork()

/* modified from fork.c in Advanced Linux Programming (page 49) */
/* http://www.makelinux.net/alp/024.htm */

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

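int main(){
  pid_t child_pid;
  
  printf("the main program process ID is %d\n", (int) getpid());
  /* Flush now so the child does not inherit (and repeat) buffered output. */
  fflush(stdout);

  child_pid = fork();
  if(child_pid < 0){
    /* fork() failed: no child process was created. */
    perror("fork");
    return 1;
  }else if(child_pid != 0){
    printf("this is the parent process, with id %d\n", (int)getpid());
    printf("child_pid=%d\n", (int)child_pid);
  }else{
    printf("this is the child  process, with id %d\n", (int)getpid());
    printf("child_pid=%d\n", (int)child_pid);
  }
  return 0;
}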

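This example shows how to create a new process by forking the current one. Note that the fork() function call (a system call) is invoked **once** but returns **twice**, because when the call completes there are two processes executing the same code. It returns the child's process ID in the parent and 0 in the child, which is how the code tells the two apart.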

Running a New Program in the Background with execvp()

/* from Advanced Linux Programming (page 51) */
/* http://www.makelinux.net/alp/024.htm */
#include <stdio.h> 
#include <stdlib.h> 
#include <sys/types.h> 
#include <unistd.h> 

/* Spawn a child process running a new program. PROGRAM is the name 
   of the program to run; the path will be searched for this program. 
   ARG_LIST is a NULL-terminated list of character strings to be 
   passed as the program's argument list. Returns  the process ID of 
   the spawned process.  */ 

pid_t spawn(char* program, char** arg_list) {
  pid_t child_pid; 

  /* Duplicate this process. */ 
  child_pid = fork(); 

  if (child_pid != 0){
    /* This is the parent process. */ 
    return child_pid; 
  }else {
    /* Now execute PROGRAM, searching for it in the path.  */ 
    execvp(program, arg_list); 
    /* The execvp  function returns only if an error occurs.  */ 
    fprintf (stderr, "an error occurred in execvp\n"); 
    abort(); 
  } 
} 

int main() {
  /*  The argument list to pass to the "ls" command.  */ 
  char* arg_list[] = {
    "ls",     /* argv[0], the name of the program.  */ 
    "-l", 
    "/", 
    NULL      /* The argument list must end with a NULL.  */ 
  }; 

  /* Spawn a child process running the "ls" command. Ignore the 
     returned child process ID.  */ 
  spawn("ls", arg_list); 
  printf("done with main program\n"); 
  return 0; 
}

Running a New Program in the Foreground with execvp()

/* from Advanced Linux Programming (page 51) */
#include <stdio.h> 
#include <stdlib.h> 
#include <sys/types.h> 
#include <sys/wait.h>
#include <unistd.h> 

/* Spawn a child process running a new program. PROGRAM is the name 
   of the program to run; the path will be searched for this program. 
   ARG_LIST is a NULL-terminated list of character strings to be 
   passed as the program's argument list. Returns  the process ID of 
   the spawned process.  */ 

pid_t spawn(char* program, char** arg_list) {
  pid_t child_pid; 

  /* Duplicate this process. */ 
  child_pid = fork(); 

  if (child_pid != 0){
    /* This is the parent process. */ 
    return child_pid; 
  }else {
    /* Now execute PROGRAM, searching for it in the path.  */ 
    execvp(program, arg_list); 
    /* The execvp  function returns only if an error occurs.  */ 
    fprintf(stderr, "an error occurred in execvp\n"); 
    abort(); 
  } 
} 

int main() {
  /*  The argument list to pass to the "ls" command.  */ 
  char* arg_list[] = {
    "ls",     /* argv[0], the name of the program.  */ 
    "-l", 
    "/", 
    NULL      /* The argument list must end with a NULL.  */ 
  }; 

  /* Spawn a child process running the "ls" command. Keep the 
     returned child process ID so that we can wait for it.  */ 
  pid_t pid = spawn("ls", arg_list); 

  /* Wait for the child process to complete.  */ 
  int child_status;
  waitpid(pid, &child_status, 0); 
  
  if (WIFEXITED(child_status)){
    printf ("the child process exited normally, with exit code %d\n", 
	     WEXITSTATUS(child_status)); 
  }else{ 
    printf("the child process exited abnormally\n"); 
  }
  
  return 0; 
}

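Note that the argument list must begin with the program name as its first argument and **must** end with NULL, which marks the end of the list. Otherwise, execvp() will not work correctly.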

Redirecting Standard Output with dup2()

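#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>   /* S_IRWXU */
#include <fcntl.h>
#include <unistd.h>
 
int main(){
  int overwrite = 0;
  int fd;
  if(overwrite){
    printf("open test.txt to overwrite.\n");
    /* O_TRUNC discards any old contents so the file is really overwritten. */
    fd =  open("test.txt", O_WRONLY | O_CREAT | O_TRUNC, S_IRWXU);
  }else{
    printf("open test.txt to append.\n");
    fd =  open("test.txt", O_WRONLY | O_APPEND | O_CREAT, S_IRWXU);
  }
  if(fd == -1){
    perror("open");
    return 1;
  }
  /* Flush pending output so it reaches the old standard output first. */
  fflush(stdout);
  dup2 (fd, STDOUT_FILENO);
  printf("hello world!\n");
  return 0;
}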

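This example opens a file ("test.txt") in write mode and makes standard output a synonym for the opened file: bytes sent to standard output go to the file. The "O_WRONLY | O_CREAT" flags cause the file to be opened for writing and to be created if it does not exist. Adding O_APPEND, as the example does when overwrite is 0, makes every write go to the end of the file.

Information about each open file is recorded in a table managed by the operating system. Each entry corresponds to an open file, and the index of the entry is an integer (the file descriptor) that is returned to the process as the return value of the open() system call. The entries for standard input, standard output, and standard error are reserved at predefined indexes: STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO (defined in <unistd.h>). The dup2(index1, index2) function copies the entry at index1 into the entry at index2, which makes file descriptor index2 a synonym for file descriptor index1.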
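The "synonym" behavior can be observed directly. In the sketch below, two one-byte writes are made through two descriptors that dup2() has tied together; because both descriptors refer to the same open file and share one file offset, the second byte lands after the first rather than on top of it. The helper name write_through_synonyms() and the idea of passing the alias descriptor number as a parameter are assumptions for this example.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write "a" through a new descriptor for PATH and
   "b" through ALIAS_FD after dup2() has made it a synonym. Returns the
   resulting file size in bytes, or -1 on error. */
long write_through_synonyms(const char *path, int alias_fd) {
  int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
  if (fd == -1)
    return -1;

  /* After dup2(), alias_fd refers to the same open file as fd. */
  if (dup2(fd, alias_fd) != alias_fd) {
    close(fd);
    return -1;
  }

  long size = -1;
  if (write(fd, "a", 1) == 1 && write(alias_fd, "b", 1) == 1)
    size = (long) lseek(fd, 0, SEEK_END);

  close(alias_fd);
  close(fd);
  return size;
}
```

The file ends up two bytes long ("ab"). Had the two descriptors come from two separate open() calls, each would have had its own offset and the second write would have overwritten the first.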

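Redirecting Standard Output in a Child Process

The "dup2 (fd, STDOUT_FILENO)" line in the previous example redirects the standard output of the current (main) process to the opened file. To redirect a child process's standard output to a file instead, you must wait until the child has been created, that is, until after the fork() call. The following example demonstrates the idea. You will see that the "Hello" message from the parent process still goes to standard output, while the child's standard output is redirected to the file.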

#include <stdio.h> 
#include <stdlib.h> 
#include <sys/types.h> 
#include <sys/wait.h>
#include <unistd.h> 
#include <sys/stat.h>
#include <fcntl.h>

int main() {
  /*  The argument list to pass to the "ls" command.  */ 
  char* arg_list[] = {
    "ls",     /* argv[0], the name of the program.  */ 
    "-l", 
    "/", 
    NULL      /* The argument list must end with a NULL.  */ 
  }; 

  /* Spawn a child process running the "ls" command. */
  pid_t child_pid = fork(); 

  if (child_pid == 0){
    /* This is the child process. */ 
    char * filename = "test.txt";
    int outfile = open(filename, O_CREAT | O_WRONLY, S_IRWXU);
    if (outfile == -1){
      fprintf(stderr, "Error: failed to create file %s\n", filename);
      /* Exit here so the child never falls through into the parent's code. */
      exit(EXIT_FAILURE);
    }
    /* redirect the standard output from this process to the file. */
    if(dup2(outfile, STDOUT_FILENO) != STDOUT_FILENO){
      fprintf(stderr, "Error: failed to redirect standard output\n");
      exit(EXIT_FAILURE);
    }
    /* The copy at STDOUT_FILENO is enough; close the original descriptor. */
    close(outfile);

    /* Now execute the program, searching for it in the path.  */ 
    execvp(arg_list[0],  arg_list); 
    /* The execvp  function returns only if an error occurs.  */ 
    fprintf(stderr, "an error occurred in execvp\n"); 
    abort(); 
  }else if (child_pid < 0){
    /* fork() failed: there is no child to wait for. */
    perror("fork");
    return 1;
  }
  /* only the parent process executes the following code. */
  fprintf(stdout, "Hello from the parent process.\n");

  /* Wait for the child process to complete.  */ 
  int child_status;
  waitpid(child_pid, &child_status, 0); 
  
  if (WIFEXITED(child_status)){
    printf ("the child process exited normally, with exit code %d\n", 
	     WEXITSTATUS(child_status)); 
  }else{ 
    printf("the child process exited abnormally\n"); 
  }
  
  return  0; 
}

Creating a Pipe with pipe() and dup2()

/* from Advanced Linux Programming (page 113) */
/* http://www.makelinux.net/alp/038.htm */
#include <stdio.h> 
#include <sys/types.h> 
#include <sys/wait.h> 
#include <unistd.h> 

int main () {
  int fds[2]; 
  pid_t pid; 

  /* Create a pipe. File descriptors for the two ends of the pipe are 
     placed in fds: fds[0] is the read end, fds[1] is the write end.  */ 
  if (pipe(fds) != 0) {
    perror("pipe");
    return 1;
  }

  printf("fds[0]=%d, fds[1]=%d\n", fds[0], fds[1]);
  /* Flush before redirecting stdout so this line cannot end up in the pipe. */
  fflush(stdout);

  /* Fork a child process.  */ 
  pid = fork(); 
  if (pid == (pid_t) -1) {
    perror("fork");
    return 1;
  }

  if (pid == (pid_t) 0) {
    /* This is the child process. Close our copy of the write end of 
       the file descriptor.  */ 
    close(fds[1]); 

    /* Connect the read end of the pipe to standard input.  */ 
    dup2(fds[0], STDIN_FILENO); 
    close(fds[0]);

    /* Replace the child process with the "sort" program.  */ 
    execlp("sort", "sort", (char*) NULL); 
    /* execlp returns only if an error occurs.  */
    perror("execlp");
    _exit(1);
  } else {
    /* This is the parent process. Close our copy of the read end of 
       the file descriptor.  */ 
    close(fds[0]); 

    /* Connect the write end of the pipe to standard out, and write 
       to it.  */ 
    dup2(fds[1], STDOUT_FILENO);
    printf("This is a test.\n"); 
    printf("Hello, world.\n"); 
    printf("My dog has fleas.\n"); 
    printf("This program is great.\n"); 
    printf("One fish, two fish.\n"); 
    fflush(stdout);
    /* Close both write ends so that sort sees end-of-file.  */
    close(fds[1]);
    close(STDOUT_FILENO);

    /* Wait for the child process to finish.  */ 
    waitpid(pid, NULL, 0); 
    /* Standard output is closed now, so report on standard error.  */
    fprintf(stderr, "parent process terminated.\n");
  } 
  return 0; 
}

A Pipe Example Connecting Two Commands

#include <stdio.h> 
#include <stdlib.h> 
#include <sys/types.h> 
#include <sys/wait.h>
#include <unistd.h> 

/* Run the command in ARG_LIST in a child process, reading from file
   descriptor RD and writing to file descriptor WD. Returns the child's
   process ID to the parent, or -1 if fork() fails.  */
pid_t run_command(char** arg_list, int rd, int wd) {
  pid_t child_pid; 

  /* Duplicate this process. */ 
  child_pid = fork(); 

  if (child_pid != 0){
    /* This is the parent process (or fork() failed and returned -1). */ 
    return child_pid; 
  }else {
    if (rd != STDIN_FILENO){
      if(dup2(rd, STDIN_FILENO) != STDIN_FILENO){
        fprintf(stderr, "Error: failed to redirect standard input\n");
        /* Exit rather than return, so the child never runs main()'s code. */
        _exit(EXIT_FAILURE);
      }
      close(rd);
    }

    if (wd != STDOUT_FILENO){
      /* Report on stderr: stdout is about to be redirected. */
      fprintf(stderr, "redirect stdout to %d.\n", wd);
      if(dup2(wd, STDOUT_FILENO) != STDOUT_FILENO){
        fprintf(stderr, "Error: failed to redirect standard output\n");
        _exit(EXIT_FAILURE);
      }
      close(wd);
    }
    /* Now execute the program, searching for it in the path.  */ 
    execvp (arg_list[0], arg_list); 
    /* The execvp  function returns only if an error occurs.  */ 
    fprintf(stderr, "an error occurred in execvp\n"); 
    abort(); 
  } 
} 

int main() {
  /*  The argument list for the pipeline "ls -l | wc -l".  */ 
  char* arg_list[] = {
    "ls",     /* argv[0], the name of the program.  */ 
    "-l", 
    "|",      /* the pipe symbol is at index 2 */
    "wc",
    "-l", 
    NULL      /* The argument list must end with a NULL.  */ 
  }; 
 
  int pipe_index = 2;
  int rd = STDIN_FILENO;
  int wd = STDOUT_FILENO;
  int fds[2];
  if (pipe(fds) != 0) {
    fprintf(stderr, "Error: unable to pipe command '%s'\n", arg_list[0]);
    return -1;
  }
  
  wd = fds[1]; /*file descriptor for the write end of the pipe*/

  // delete the pipe symbol and insert a null to terminate the
  // first command's argument list
  arg_list[pipe_index] = NULL;

  // run first command: read from STDIN and write to the pipe
  pid_t first = run_command(arg_list, rd, wd);
  // close the parent's write end so the reader can see end-of-file
  close(fds[1]);

  rd = fds[0];
  wd = STDOUT_FILENO;

  // run the second command: read from the pipe and write to STDOUT
  // the argument for this command starts at pipe_index+1
  pid_t second = run_command(arg_list+pipe_index+1, rd, wd);
  close(fds[0]);

  // wait for both commands so that no zombie processes are left behind
  if (first > 0) waitpid(first, NULL, 0);
  if (second > 0) waitpid(second, NULL, 0);

  fprintf(stderr, "done with main program\n"); 
  return 0; 
}

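In this example, the original argument list is split into two lists at the pipe symbol, and the pipe symbol is replaced with NULL. This lets us use the two argument lists to run two separate commands/programs connected by a pipe.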
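The example hard-codes the position of the pipe symbol. A minimal sketch of finding it at run time follows; the helper name split_at_pipe() is hypothetical, and only the first pipe symbol is handled.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: find the first "|" in the NULL-terminated list
   ARGS, replace it with NULL, and return a pointer to the argument list
   of the second command. Returns NULL if there is no pipe symbol. */
char **split_at_pipe(char **args) {
  for (int i = 0; args[i] != NULL; i++) {
    if (strcmp(args[i], "|") == 0) {
      args[i] = NULL;        /* terminates the first command's list */
      return &args[i + 1];   /* second command starts after the pipe */
    }
  }
  return NULL;
}
```

After the call, the original pointer still names the first command's list, and the returned pointer names the second command's list, so both can be passed to execvp() without copying any strings.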

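A process is a program in execution. It is an abstraction for an entity that the operating system manages. Each process has its own address space, along with other information kept in data structures managed by the operating system.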
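One way to see that each process has its own address space is to change a variable in a forked child and then look at it in the parent. The helper name value_seen_by_parent() is hypothetical; this is a sketch, not a complete treatment of process isolation.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that changes its copy of a variable,
   then return the value the parent still sees. Returns -1 on error. */
int value_seen_by_parent(void) {
  int value = 1;
  pid_t pid = fork();
  if (pid < 0)
    return -1;
  if (pid == 0) {
    value = 42;      /* modifies only the child's address space */
    _exit(0);
  }
  if (waitpid(pid, NULL, 0) != pid)
    return -1;
  return value;      /* the parent's copy is unchanged */
}
```

The function returns 1: the assignment in the child happened in a separate copy of the variable, so the parent never observes it.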

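Input and Output

Input and output may be the most pervasive concepts in Unix. In Unix, everything is a file; this is by design, so that programs can interact with different devices in a uniform way. A file can be given to a program as input, and a file can be created from a program's output.

Every process has a set of input and output streams. A process may have as many streams as its developer wants, but every process has at least three: standard input, standard output, and standard error. Many programs use these simple streams so that users can combine them easily. Take the programs cat and grep, for example: cat sends the given file to standard output (usually your terminal unless told otherwise), and grep searches its standard input for a pattern. The command `$ cat file.txt | grep "hi"` therefore searches the file for the text "hi".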
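A program that works on streams rather than on named files can be placed anywhere in a pipeline. The sketch below is a greatly simplified stand-in for grep that matches a fixed string only; the helper name filter_lines() and the 1024-byte line buffer are assumptions for this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy every line of IN that contains PATTERN to
   OUT. Lines longer than the buffer are processed in pieces in this
   sketch. Returns the number of matching pieces written. */
int filter_lines(FILE *in, FILE *out, const char *pattern) {
  char line[1024];
  int matches = 0;
  while (fgets(line, sizeof line, in) != NULL) {
    if (strstr(line, pattern) != NULL) {
      fputs(line, out);
      matches++;
    }
  }
  return matches;
}
```

A main() that calls filter_lines(stdin, stdout, "hi") could then be used on the right-hand side of a pipe in the same position as grep in the command above, with any error messages going to stderr so that they do not mix with the filtered output.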

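File Systems

File System Concepts

File abstraction - a stream of bytes whose meaning is defined by the users of the file system
File system users - end users (humans) and direct users (programs, such as applications or the shell)
User view - a set of system calls such as creat, open, close, seek, delete ...
File attributes - metadata: owner, size, timestamps ...
Directory abstraction - a list of files and directories that maps the name of each file or directory to the information needed to find its data; absolute and relative paths
Directory operations - create, delete, open, close, rename, link, unlink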
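The file attributes listed above can be read from a program with the stat() system call. The sketch below prints a few of them; the helper name print_attributes() and the choice of fields are assumptions for this example.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: print a few attributes of PATH that the file
   system stores as metadata. Returns 0 on success, -1 on error. */
int print_attributes(const char *path) {
  struct stat st;
  if (stat(path, &st) != 0) {
    perror("stat");
    return -1;
  }
  printf("owner uid:  %ld\n", (long) st.st_uid);
  printf("size:       %lld bytes\n", (long long) st.st_size);
  printf("modified:   %lld (seconds since the epoch)\n",
         (long long) st.st_mtime);
  printf("hard links: %ld\n", (long) st.st_nlink);
  return 0;
}
```

Note that none of these values are stored in the file's own bytes; they come from the metadata the file system keeps about the file.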

File System Implementation

Layout: partitions, boot block, superblock, ...
Disk block allocation:

 contiguous allocation
 linked list allocation: the first word in each block is used as a pointer to the next one.
 file allocation table: linked list allocation using a table in memory. 
 i-node (index-node): an i-node is in memory when the corresponding file is open. With an i-node design we can calculate the largest possible file size.
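As a worked example of the file-size calculation, assume an i-node layout that the text does not specify: a fixed number of direct block pointers plus one single-, one double-, and one triple-indirect pointer. The function name and all parameter values below are assumptions chosen for illustration.

```c
#include <stdint.h>

/* Largest file, in bytes, addressable from one i-node that holds
   N_DIRECT direct block pointers plus one single-, one double- and one
   triple-indirect pointer. BLOCK_SIZE and PTR_SIZE are in bytes. */
uint64_t max_file_size(uint64_t block_size, uint64_t ptr_size,
                       uint64_t n_direct) {
  uint64_t p = block_size / ptr_size;   /* pointers per indirect block */
  uint64_t blocks = n_direct + p + p * p + p * p * p;
  return blocks * block_size;
}
```

With 4096-byte blocks, 4-byte block pointers, and 12 direct pointers, each indirect block holds 1024 pointers, so the i-node can reach 12 + 1024 + 1024^2 + 1024^3 = 1,074,791,436 blocks, or 4,402,345,721,856 bytes (roughly 4 TiB).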

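Directory implementation: i-nodes, long file names
File sharing: symbolic links vs. hard links
Disk block size: trade-offs between wasted disk space and performance (data rate)
Keeping track of free blocks: linked list vs. bitmap
System backup
Caching
Steps to look up a file given a path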
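Of the two ways of keeping track of free blocks, the bitmap is easy to sketch in a few lines. The disk size of 64 blocks, the in-memory array, and the function names alloc_block() and free_block() are assumptions for illustration; a real file system would store the bitmap on disk.

```c
#include <stdint.h>

#define N_BLOCKS 64   /* assumed disk size in blocks, for illustration only */

/* One bit per disk block: 1 means in use, 0 means free. */
static uint8_t bitmap[N_BLOCKS / 8];

/* Find the lowest-numbered free block, mark it used, and return its
   number, or -1 if every block is in use. */
int alloc_block(void) {
  for (int b = 0; b < N_BLOCKS; b++) {
    if ((bitmap[b / 8] & (1u << (b % 8))) == 0) {
      bitmap[b / 8] |= (uint8_t)(1u << (b % 8));
      return b;
    }
  }
  return -1;
}

/* Mark block B as free again. */
void free_block(int b) {
  bitmap[b / 8] &= (uint8_t)~(1u << (b % 8));
}
```

The bitmap costs one bit per block regardless of how full the disk is, whereas a linked list of free blocks shrinks as the disk fills up; that difference is one side of the trade-off named in the list above.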

Resources