Preface

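Part A of this lab gives the operating system support for processes (a single process for now), and Part B handles exceptions and interrupts so that the system can switch between kernel mode and user mode. I did not find this lab very hard, probably because the previous two labs had already toughened me up. I still got stuck on the interrupt entry code for a good while, though, since I am not comfortable writing assembly.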

Part A

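JOS describes a process with the Env structure. The handout already covers Env in detail, so I will not repeat it here. The key point is that environments are managed through the envs array and env_free_list. Note that, unlike page_free_list in the previous lab, env_free_list must not end up in reverse order: it has to list the environments in the same order as the envs array, so that the first allocation returns envs[0].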

Exercise 1

Modify mem_init() in kern/pmap.c to allocate and map the envs array. This array consists of exactly NENV instances of the Env structure allocated much like how you allocated the pages array. Also like the pages array, the memory backing envs should also be mapped user read-only at UENVS (defined in inc/memlayout.h) so user processes can read from this array.

envs = (struct Env *) boot_alloc(NENV * sizeof(struct Env));
boot_map_region(kern_pgdir, UENVS, PTSIZE, PADDR(envs), PTE_U);

By now this should be familiar: allocate memory for envs, then map it into the kernel page directory at UENVS.

Exercise 2

In the file env.c, finish coding the following functions:

  • env_init()
    Initialize all of the Env structures in the envs array and add them to the env_free_list. Also calls env_init_percpu, which configures the segmentation hardware with separate segments for privilege level 0 (kernel) and privilege level 3 (user).
  • env_setup_vm()
    Allocate a page directory for a new environment and initialize the kernel portion of the new environment’s address space.
  • region_alloc()
    Allocates and maps physical memory for an environment
  • load_icode()
    You will need to parse an ELF binary image, much like the boot loader already does, and load its contents into the user address space of a new environment.
  • env_create()
    Allocate an environment with env_alloc and call load_icode to load an ELF binary into it.
  • env_run()
    Start a given environment running in user mode.

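env_init() initializes envs and links every entry into env_free_list. It works much like page_init() from the previous lab, except that the loop runs backwards so that the list ends up in the same order as the array.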

void env_init(void) {
    // Set up envs array
    for (int i = NENV - 1; i >= 0; i--) {
        envs[i].env_id = 0;
        envs[i].env_status = ENV_FREE;
        envs[i].env_link = env_free_list;
        env_free_list = &envs[i];
    }
    // Per-CPU part of the initialization
    env_init_percpu();
}
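To see why the loop runs backwards, here is a minimal user-space sketch, not kernel code. The names struct MiniEnv, MINI_NENV, mini_env_init() and mini_free_index() are stand-ins invented for this illustration. Pushing entries onto the head of a list from the last index down to the first leaves index 0 at the head.

```c
#include <stddef.h>

#define MINI_NENV 4   /* hypothetical small table size; JOS uses NENV */

struct MiniEnv {
    int id;
    struct MiniEnv *link;   /* next free entry, playing the role of env_link */
};

static struct MiniEnv mini_envs[MINI_NENV];
static struct MiniEnv *mini_free_list = NULL;

/* Push every entry onto the head of the free list, last index first. */
void mini_env_init(void) {
    mini_free_list = NULL;
    for (int i = MINI_NENV - 1; i >= 0; i--) {
        mini_envs[i].id = 0;
        mini_envs[i].link = mini_free_list;
        mini_free_list = &mini_envs[i];
    }
}

/* Return the array index of the n-th entry on the free list, or -1. */
int mini_free_index(int n) {
    struct MiniEnv *e = mini_free_list;
    while (e != NULL && n > 0) {
        e = e->link;
        n--;
    }
    return e == NULL ? -1 : (int)(e - mini_envs);
}
```

Walking the list after initialization visits indices 0, 1, 2, 3 in that order. A forward loop with the same push-to-head step would give 3, 2, 1, 0 instead.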

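env_setup_vm() allocates a page directory for the new environment. The approach here is to copy the kernel's page directory and then point the UVPT entry at the environment's own page directory.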

static int env_setup_vm(struct Env *e) {
    int i;
    struct PageInfo *p = NULL;

    // Allocate a page for the page directory
    if (!(p = page_alloc(ALLOC_ZERO)))
        return -E_NO_MEM;

    // LAB 3: Your code here.
    e->env_pgdir = page2kva(p);
    p->pp_ref++;
    memcpy(e->env_pgdir, kern_pgdir, PGSIZE);

    // UVPT maps the env's own page table read-only.
    // Permissions: kernel R, user R
    e->env_pgdir[PDX(UVPT)] = PADDR(e->env_pgdir) | PTE_P | PTE_U;
    return 0;
}
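The UVPT line can be checked with a little arithmetic. The sketch below is a user-space illustration. The constants are assumed to match inc/mmu.h and inc/memlayout.h, and pdx() and uvpt_entry() are hypothetical helper names, not JOS functions.

```c
#include <stdint.h>

/* Values assumed to match inc/mmu.h and inc/memlayout.h. */
#define PDXSHIFT 22
#define PTE_P    0x001u
#define PTE_W    0x002u
#define PTE_U    0x004u
#define UVPT     0xef400000u

/* Page directory index: the top 10 bits of a 32-bit linear address. */
uint32_t pdx(uint32_t la) {
    return (la >> PDXSHIFT) & 0x3FF;
}

/* Build the self-referencing entry: the physical address of the page
 * directory plus the present and user bits, with no write bit set. */
uint32_t uvpt_entry(uint32_t pgdir_pa) {
    return pgdir_pa | PTE_P | PTE_U;
}
```

Because the entry lacks PTE_W, user code can read its own page tables through UVPT but cannot modify them.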

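region_alloc() allocates physical pages for a range of user virtual addresses and maps them into the environment's page directory. It is similar to boot_map_region() from lab 2, except that it allocates each page with page_alloc() instead of mapping an existing physical range.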

static void
region_alloc(struct Env *e, void *va, size_t len)
{
    void *va_t = ROUNDDOWN(va, PGSIZE);
    void *end = ROUNDUP(va + len, PGSIZE);
    for (; va_t < end; va_t += PGSIZE) {
        struct PageInfo *pp = page_alloc(ALLOC_ZERO);
        if (pp == NULL)
            panic("region_alloc: page_alloc failed");
        // page_insert can fail too, when it needs a new page table.
        if (page_insert(e->env_pgdir, pp, va_t, PTE_U | PTE_W) < 0)
            panic("region_alloc: page_insert failed");
    }
}
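The rounding matters because va and len need not be page-aligned. The following user-space sketch counts how many pages a byte range touches. round_down(), round_up() and pages_spanned() are hypothetical helpers written for this example (the kernel uses the ROUNDDOWN and ROUNDUP macros), and the sketch assumes va + len does not overflow 32 bits.

```c
#include <stddef.h>
#include <stdint.h>

#define PGSIZE 4096u

/* Round an address down or up to a page boundary. */
uint32_t round_down(uint32_t a) { return a - a % PGSIZE; }
uint32_t round_up(uint32_t a)   { return round_down(a + PGSIZE - 1); }

/* Number of pages that the byte range [va, va + len) touches. */
size_t pages_spanned(uint32_t va, uint32_t len) {
    uint32_t start = round_down(va);
    uint32_t end = round_up(va + len);
    return (end - start) / PGSIZE;
}
```

A range of exactly one page that starts 0x20 bytes into a page straddles two pages, which is why both ends have to be rounded.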

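load_icode() comes with a lot of comments, and it took me a while to read through them at first. In short, it loads an ELF program into the user address space. Normally a user program would be read from disk, but JOS has no file system yet, so MIT links a few user programs directly into the kernel image. Nothing has to be read from disk, which makes things easier. The boot loader is a good reference for the details. Because the segments are copied to user virtual addresses, the function switches to the environment's page directory with lcr3() while copying. Then it sets the environment's entry point (tf_eip) to the program's entry point. Last of all, it allocates and maps one page for the initial user stack.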

static void load_icode(struct Env *e, uint8_t *binary) {
    struct Elf *elf = (struct Elf *) binary;
    if (elf->e_magic != ELF_MAGIC)
        panic("load_icode: not ELF executable.");
    struct Proghdr *ph = (struct Proghdr *) (elf->e_phoff + binary);
    struct Proghdr *eph = ph + elf->e_phnum;
    // Switch to the env's address space so p_va can be written directly.
    lcr3(PADDR(e->env_pgdir));
    for (; ph < eph; ph++) {
        if (ph->p_type == ELF_PROG_LOAD) {
            region_alloc(e, (void *) ph->p_va, ph->p_memsz);
            // Zero the whole segment, then copy the part present in the file.
            memset((void *) ph->p_va, 0, ph->p_memsz);
            memcpy((void *) ph->p_va, binary + ph->p_offset, ph->p_filesz);
        }
    }
    lcr3(PADDR(kern_pgdir));
    e->env_tf.tf_eip = elf->e_entry;
    // One page for the initial user stack.
    region_alloc(e, (void *) (USTACKTOP - PGSIZE), PGSIZE);
}
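The memset followed by memcpy handles the case where a segment is larger in memory than in the file (the .bss part). The user-space sketch below shows the same rule on a plain buffer. load_segment() is a hypothetical helper, and the filesz > memsz check is an extra sanity test that the code above does not perform.

```c
#include <stddef.h>
#include <string.h>

/* Copy filesz bytes from the image, then zero the remainder up to memsz.
 * Returns 0 on success, or -1 if the sizes are inconsistent. */
int load_segment(unsigned char *dst, const unsigned char *src,
                 size_t filesz, size_t memsz) {
    if (filesz > memsz)
        return -1;
    memcpy(dst, src, filesz);
    memset(dst + filesz, 0, memsz - filesz);
    return 0;
}
```

Bytes beyond memsz are left alone, so the function never writes outside the region that was allocated for the segment.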

env_create() is simple. It combines the functions above: first allocate an environment, then load the user program into it.

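void env_create(uint8_t *binary, enum EnvType type) {
    struct Env *e;
    // env_alloc fails when there are no free envs or no memory left.
    int r = env_alloc(&e, 0);
    if (r < 0)
        panic("env_create: %e", r);
    load_icode(e, binary);
    e->env_type = type;
}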

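env_run() runs an environment. It is also fairly simple if you follow the comments: mark the previously running environment as runnable, make the new environment current, switch to its address space, and finally call env_pop_tf() to restore the saved registers from its trap frame and jump to the user program. env_pop_tf() does not return.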

void env_run(struct Env *e) {
    if (curenv != NULL) {
        if (curenv->env_status == ENV_RUNNING)
            curenv->env_status = ENV_RUNNABLE;
    }
    curenv = e;
    curenv->env_status = ENV_RUNNING;
    curenv->env_runs++;
    lcr3(PADDR(e->env_pgdir));
    env_pop_tf(&e->env_tf);
}

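At this point the environment is set up and the kernel can switch into user mode, but hello world still does not run, because the kernel cannot handle interrupts yet. In other words, there is no way to get from user mode back into the kernel: when hello calls cprintf(), the library issues a system call through a software interrupt, and nothing handles it.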

Interrupts, exceptions, and system calls

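Interrupts, exceptions, and system calls are how user programs and external devices interact with the kernel. For example, pressing a key raises an interrupt that tells the operating system a character is ready to be read. When a user program hits an error it cannot continue past, such as a division by zero, the processor raises an exception and lets the kernel deal with it. System calls happen all the time; a call to printf, for instance, ends up making a system call to write its output. The three differ in subtle ways, as the examples suggest: interrupts are asynchronous, since they come from outside the running instruction stream, while exceptions and system calls are synchronous, since they are caused by the instruction being executed. In the rest of this post, "interrupt" is used in the broad sense, as an umbrella term for all three. Software raises an interrupt with the int n instruction; when an interrupt occurs, the processor uses n as an index into the interrupt descriptor table (IDT) and jumps to the corresponding handler.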

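There is also something called the TSS to keep in mind. When the processor switches from user mode to kernel mode, it reads the kernel stack location (SS0 and ESP0) from the TSS and pushes the interrupted state onto that stack, which is how the user context gets saved.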

Exercise 4

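The main work in this exercise is building the IDT and registering a handler for each vector. Before starting, read the manual sections mentioned in Exercise 3 carefully and make sure you understand the interrupt mechanism in detail. Personally I find that material too hardware-oriented; I would recommend the chapter on exceptions in CSAPP instead.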

Edit trapentry.S and trap.c and implement the features described above. The macros TRAPHANDLER and TRAPHANDLER_NOEC in trapentry.S should help you, as well as the T_* defines in inc/trap.h. You will need to add an entry point in trapentry.S (using those macros) for each trap defined in inc/trap.h, and you’ll have to provide _alltraps which the TRAPHANDLER macros refer to. You will also need to modify trap_init() to initialize the idt to point to each of these entry points defined in trapentry.S; the SETGATE macro will be helpful here.

Your _alltraps should:

  1. push values to make the stack look like a struct Trapframe
  2. load GD_KD into %ds and %es
  3. pushl %esp to pass a pointer to the Trapframe as an argument to trap()
  4. call trap (can trap ever return?)

Consider using the pushal instruction; it fits nicely with the layout of the struct Trapframe.

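First, define the handlers in trapentry.S with the two predefined macros. Check the Intel manual for this, because some exceptions push an error code and others do not: use TRAPHANDLER for the former and TRAPHANDLER_NOEC for the latter.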

TRAPHANDLER_NOEC(t_divide, T_DIVIDE)
TRAPHANDLER_NOEC(t_debug, T_DEBUG)
TRAPHANDLER_NOEC(t_nmi, T_NMI)
TRAPHANDLER_NOEC(t_brkpt, T_BRKPT)
TRAPHANDLER_NOEC(t_oflow, T_OFLOW)
TRAPHANDLER_NOEC(t_bound, T_BOUND)
TRAPHANDLER_NOEC(t_illop, T_ILLOP)
TRAPHANDLER_NOEC(t_device, T_DEVICE)
TRAPHANDLER(t_dblflt, T_DBLFLT)
TRAPHANDLER(t_tss, T_TSS)
TRAPHANDLER(t_segnp, T_SEGNP)
TRAPHANDLER(t_stack, T_STACK)
TRAPHANDLER(t_gpflt, T_GPFLT)
TRAPHANDLER(t_pgflt, T_PGFLT)
TRAPHANDLER_NOEC(t_fperr, T_FPERR)
TRAPHANDLER(t_align, T_ALIGN)
TRAPHANDLER_NOEC(t_mchk, T_MCHK)
TRAPHANDLER_NOEC(t_simderr, T_SIMDERR)
TRAPHANDLER_NOEC(t_syscall, T_SYSCALL)
TRAPHANDLER_NOEC(irq_timer, IRQ_OFFSET + IRQ_TIMER)
TRAPHANDLER_NOEC(irq_kbd, IRQ_OFFSET + IRQ_KBD)
TRAPHANDLER_NOEC(irq_serial, IRQ_OFFSET + IRQ_SERIAL)
TRAPHANDLER_NOEC(irq_spurious, IRQ_OFFSET + IRQ_SPURIOUS)
TRAPHANDLER_NOEC(irq_ide, IRQ_OFFSET + IRQ_IDE)
TRAPHANDLER_NOEC(irq_error, IRQ_OFFSET + IRQ_ERROR)

Next, _alltraps has to finish building the Trapframe and then call trap() to dispatch the interrupt. Just follow the comments and the instructions above.

_alltraps:
    # Build trap frame.
    pushl %ds
    pushl %es
    pushal

    movw $(GD_KD), %ax
    movw %ax, %ds
    movw %ax, %es

    pushl %esp
    call trap
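The push order follows from the layout of struct Trapframe: the stack grows downwards, so whatever is pushed last ends up at the lowest address, which is the first field of the struct. The sketch below reproduces the layout in user space so the offsets can be checked. The field names follow inc/trap.h as I remember it, so treat the exact names as an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Registers in the order pushal leaves them in memory, lowest address first. */
struct PushRegs {
    uint32_t reg_edi, reg_esi, reg_ebp, reg_oesp;
    uint32_t reg_ebx, reg_edx, reg_ecx, reg_eax;
};

struct Trapframe {
    struct PushRegs tf_regs;        /* pushed last, by pushal */
    uint16_t tf_es, tf_padding1;    /* pushl %es */
    uint16_t tf_ds, tf_padding2;    /* pushl %ds */
    uint32_t tf_trapno;             /* pushed by the TRAPHANDLER macros */
    uint32_t tf_err;                /* pushed by the CPU, or 0 by TRAPHANDLER_NOEC */
    uint32_t tf_eip;                /* everything from here on is pushed by the CPU */
    uint16_t tf_cs, tf_padding3;
    uint32_t tf_eflags;
    uint32_t tf_esp;                /* only on a privilege-level change */
    uint16_t tf_ss, tf_padding4;
};
```

After pushal, %esp holds the address of tf_regs.reg_edi, which is the start of the struct, so pushl %esp passes trap() a valid struct Trapframe pointer.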

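Finally, set up the IDT in trap_init() and give each gate the appropriate privilege level. The handler symbols are defined in trapentry.S, so trap.c also needs a declaration for each of them (for example void t_divide();) before they are used; those declarations are not shown here.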

void trap_init(void) {
    extern struct Segdesc gdt[];
    SETGATE(idt[T_DIVIDE], 0, GD_KT, t_divide, 0);
    SETGATE(idt[T_DEBUG], 0, GD_KT, t_debug, 0);
    SETGATE(idt[T_NMI], 0, GD_KT, t_nmi, 0);
    SETGATE(idt[T_BRKPT], 0, GD_KT, t_brkpt, 3);
    SETGATE(idt[T_OFLOW], 0, GD_KT, t_oflow, 0);
    SETGATE(idt[T_BOUND], 0, GD_KT, t_bound, 0);
    SETGATE(idt[T_ILLOP], 0, GD_KT, t_illop, 0);
    SETGATE(idt[T_DEVICE], 0, GD_KT, t_device, 0);
    SETGATE(idt[T_DBLFLT], 0, GD_KT, t_dblflt, 0);
    SETGATE(idt[T_TSS], 0, GD_KT, t_tss, 0);
    SETGATE(idt[T_SEGNP], 0, GD_KT, t_segnp, 0);
    SETGATE(idt[T_STACK], 0, GD_KT, t_stack, 0);
    SETGATE(idt[T_GPFLT], 0, GD_KT, t_gpflt, 0);
    SETGATE(idt[T_PGFLT], 0, GD_KT, t_pgflt, 0);
    SETGATE(idt[T_FPERR], 0, GD_KT, t_fperr, 0);
    SETGATE(idt[T_ALIGN], 0, GD_KT, t_align, 0);
    SETGATE(idt[T_MCHK], 0, GD_KT, t_mchk, 0);
    SETGATE(idt[T_SIMDERR], 0, GD_KT, t_simderr, 0);
    SETGATE(idt[T_SYSCALL], 0, GD_KT, t_syscall, 3);
    SETGATE(idt[IRQ_OFFSET + IRQ_TIMER], 0, GD_KT, irq_timer, 0);
    SETGATE(idt[IRQ_OFFSET + IRQ_KBD], 0, GD_KT, irq_kbd, 0);
    SETGATE(idt[IRQ_OFFSET + IRQ_SERIAL], 0, GD_KT, irq_serial, 0);
    SETGATE(idt[IRQ_OFFSET + IRQ_SPURIOUS], 0, GD_KT, irq_spurious, 0);
    SETGATE(idt[IRQ_OFFSET + IRQ_IDE], 0, GD_KT, irq_ide, 0);
    SETGATE(idt[IRQ_OFFSET + IRQ_ERROR], 0, GD_KT, irq_error, 0);
    // Per-CPU setup
    trap_init_percpu();
}
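Only T_BRKPT and T_SYSCALL get DPL 3 because the last SETGATE argument decides who may invoke the gate with a software int instruction. The user-space sketch below models that rule. delivered_vector() is a hypothetical function written for this example, and the vector numbers are assumed to match inc/trap.h. The check applies to software interrupts only, not to exceptions the processor raises itself.

```c
#include <stdbool.h>

/* Vector numbers assumed to match inc/trap.h. */
#define T_BRKPT   3
#define T_GPFLT   13
#define T_PGFLT   14
#define T_SYSCALL 48

/* A software "int n" is allowed only if the caller's CPL is numerically
 * less than or equal to the gate's DPL. */
bool int_n_allowed(int gate_dpl, int cpl) {
    return cpl <= gate_dpl;
}

/* Vector that actually reaches the kernel when code running at
 * privilege level cpl executes "int n" on a gate with gate_dpl. */
int delivered_vector(int n, int gate_dpl, int cpl) {
    return int_n_allowed(gate_dpl, cpl) ? n : T_GPFLT;
}
```

With this model, a user program that executes int $14 does not get to fake a page fault: the gate has DPL 0, so the kernel sees a general protection fault instead.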

Part B

Exercise 5 && Exercise 6

Modify trap_dispatch() to dispatch page fault exceptions to page_fault_handler(). You should now be able to get make grade to succeed on the faultread, faultreadkernel, faultwrite, and faultwritekernel tests. If any of them don’t work, figure out why and fix them.

Modify trap_dispatch() to make breakpoint exceptions invoke the kernel monitor.

These two are simple, so I have put them together. They just dispatch the trap to the right handler; there is not much to explain.

switch (tf->tf_trapno) {
case T_PGFLT:
    page_fault_handler(tf);
    return;
case T_BRKPT:
    monitor(tf);
    return;
}

Exercise 7

Add a handler in the kernel for interrupt vector T_SYSCALL. You will have to edit kern/trapentry.S and kern/trap.c’s trap_init(). You also need to change trap_dispatch() to handle the system call interrupt by calling syscall() (defined in kern/syscall.c) with the appropriate arguments, and then arranging for the return value to be passed back to the user process in %eax. Finally, you need to implement syscall() in kern/syscall.c. Make sure syscall() returns -E_INVAL if the system call number is invalid. You should read and understand lib/syscall.c (especially the inline assembly routine) in order to confirm your understanding of the system call interface. Handle all the system calls listed in inc/syscall.h by invoking the corresponding kernel function for each call.

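This exercise adds system call handling. Once it is done, hello world runs in full. I already registered the system call gate when building the IDT, so only trap_dispatch() needs to change here. The system call itself is triggered in lib/syscall.c by the following inline assembly statement: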

asm volatile("int %1\n"
             : "=a" (ret)
             : "i" (T_SYSCALL),
               "a" (num),
               "d" (a1),
               "c" (a2),
               "b" (a3),
               "D" (a4),
               "S" (a5)
             : "cc", "memory");

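A system call passes in the number of the call to make, along with its arguments. So when dispatching a system call, pass the saved register values along as the interface describes, and let kern/syscall.c dispatch on the call number. Finally, store the return value in the saved eax so that the user process sees it, negative error codes included.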

case T_SYSCALL:
    // Pass the result, including negative error codes such as -E_INVAL,
    // back to the user process in %eax. A bad system call number from
    // user code must not panic the kernel.
    tf->tf_regs.reg_eax = syscall(
        tf->tf_regs.reg_eax,
        tf->tf_regs.reg_edx,
        tf->tf_regs.reg_ecx,
        tf->tf_regs.reg_ebx,
        tf->tf_regs.reg_edi,
        tf->tf_regs.reg_esi);
    return;

kern/syscall.c

int32_t
syscall(uint32_t syscallno, uint32_t a1, uint32_t a2, uint32_t a3, uint32_t a4, uint32_t a5) {
    switch (syscallno) {
    case SYS_cputs:
        sys_cputs((const char *) a1, (size_t) a2);
        return 0;
    case SYS_cgetc:
        return sys_cgetc();
    case SYS_getenvid:
        return sys_getenvid();
    case SYS_env_destroy:
        return sys_env_destroy((envid_t) a1);
    default:
        return -E_INVAL;
    }
}
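The register round trip can be modelled in user space. In the sketch below, struct DemoRegs, SYS_demo_add, demo_syscall() and demo_dispatch() are all hypothetical names made up for the illustration; only the idea (call number in eax, arguments in edx and ecx, result written back to eax) comes from the code above. E_INVAL is assumed to be 3, as in inc/error.h.

```c
#include <stdint.h>

#define E_INVAL 3               /* assumed to match inc/error.h */

enum { SYS_demo_add = 0 };      /* hypothetical system call number */

/* Hypothetical subset of the saved user registers. */
struct DemoRegs {
    uint32_t eax, edx, ecx;
};

/* Dispatch on the call number; unknown numbers yield -E_INVAL. */
static int32_t demo_syscall(uint32_t no, uint32_t a1, uint32_t a2) {
    switch (no) {
    case SYS_demo_add:
        return (int32_t)(a1 + a2);
    default:
        return -E_INVAL;
    }
}

/* Read the call number and arguments from the saved registers and
 * write the result back into the saved eax. */
void demo_dispatch(struct DemoRegs *r) {
    r->eax = (uint32_t) demo_syscall(r->eax, r->edx, r->ecx);
}
```

An invalid call number leaves a negative value in eax for the caller to inspect, and the kernel keeps running.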

Exercise 8

Add the required code to the user library, then boot your kernel. You should see user/hello print “hello, world” and then print “i am environment 00001000”. user/hello then attempts to “exit” by calling sys_env_destroy() (see lib/libmain.c and lib/exit.c). Since the kernel currently only supports one user environment, it should report that it has destroyed the only environment and then drop into the kernel monitor.

This one is simple too: point thisenv at the Env structure of the currently running environment.

thisenv = &envs[ENVX(sys_getenvid())];
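ENVX extracts the array index from an environment id. The sketch below assumes LOG2NENV is 10, as in inc/env.h, and uses a hypothetical function envx() in place of the macro. It shows why the id 00001000 printed by hello maps to envs[0].

```c
#include <stdint.h>

#define LOG2NENV 10                 /* assumed to match inc/env.h */
#define NENV     (1 << LOG2NENV)

/* Index into envs[]: the low LOG2NENV bits of the environment id. */
uint32_t envx(int32_t envid) {
    return (uint32_t) envid & (NENV - 1);
}
```

The bits above the index act as a generation counter, so a reused slot gets a different id while still mapping to the same array entry.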

Exercise 9

Change kern/trap.c to panic if a page fault happens in kernel mode.

Hint: to determine whether a fault happened in user mode or in kernel mode, check the low bits of the tf_cs.

Read user_mem_assert in kern/pmap.c and implement user_mem_check in that same file.

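Memory protection. This is also straightforward if you follow the instructions. It mostly comes down to checking permission bits: a user program must not access kernel memory, and a page fault taken while in kernel mode has to be reported with a panic.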

kern/trap.c

// In page_fault_handler: the low two bits of tf_cs hold the privilege
// level of the interrupted code; 0 means kernel mode.
if ((tf->tf_cs & 3) == 0) {
    panic("page_fault_handler: page fault in kernel mode");
}
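The low two bits of a code segment selector are the privilege level. The user-space sketch below checks this for the two selectors involved. GD_KT and GD_UT are assumed to be 0x08 and 0x18 as in inc/memlayout.h, and trap_from_kernel() and trap_from_user() are hypothetical helper names.

```c
#include <stdbool.h>
#include <stdint.h>

#define GD_KT 0x08   /* kernel text selector, assumed from inc/memlayout.h */
#define GD_UT 0x18   /* user text selector, assumed from inc/memlayout.h */

/* Privilege level 0 in the low two bits means the trap came from the kernel. */
bool trap_from_kernel(uint16_t cs) {
    return (cs & 3) == 0;
}

/* Privilege level 3 means the trap came from user mode. */
bool trap_from_user(uint16_t cs) {
    return (cs & 3) == 3;
}
```

User environments run with cs set to GD_UT | 3, so masking the low bits distinguishes the two cases without depending on the exact selector value.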

kern/pmap.c

int user_mem_check(struct Env *env, const void *va, size_t len, int perm) {
    void *vat = (void *) va;
    void *end = (void *) va + len;
    int p = perm | PTE_P;
    pte_t *pte;
    for (; vat < end; vat = ROUNDDOWN(vat + PGSIZE, PGSIZE)) {
        // Addresses at or above ULIM are never accessible to user code.
        if ((uint32_t) vat >= ULIM) {
            user_mem_check_addr = (uintptr_t) vat;
            return -E_FAULT;
        }
        // Reset pte first: page_lookup may leave it unset when no
        // page table exists for this address.
        pte = NULL;
        page_lookup(env->env_pgdir, vat, &pte);
        if (!(pte && ((*pte & p) == p))) {
            user_mem_check_addr = (uintptr_t) vat;
            return -E_FAULT;
        }
    }
    return 0;
}
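The loop checks one address per page and reports the first failing address: va itself if the first page fails, otherwise the start of the failing page. The user-space sketch below models just that iteration. DEMO_LIMIT, demo_page_ok() and demo_range_check() are hypothetical stand-ins for the page table lookup, and the sketch assumes va + len does not overflow 32 bits.

```c
#include <stdint.h>

#define PGSIZE     4096u
#define DEMO_LIMIT 0x00803000u   /* hypothetical: pages below this are accessible */

/* Toy stand-in for the page table permission lookup. */
static int demo_page_ok(uint32_t va) {
    return va < DEMO_LIMIT;
}

/* Check every page touched by [va, va + len). Returns 0 on success;
 * otherwise stores the first failing address in *bad and returns -1. */
int demo_range_check(uint32_t va, uint32_t len, uint32_t *bad) {
    uint32_t end = va + len;
    /* After the first iteration, a is always the start of the next page. */
    for (uint32_t a = va; a < end; a = (a / PGSIZE + 1) * PGSIZE) {
        if (!demo_page_ok(a)) {
            *bad = a;
            return -1;
        }
    }
    return 0;
}
```

Starting at va rather than at the rounded-down page address is what makes the reported address exact when the very first page is the bad one.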

kern/syscall.c

Finally, add a check to sys_cputs(), since it is the only one of these calls that takes a user-supplied pointer. The kernel only reads the string, so the check asks for PTE_U and does not require PTE_W.

static void sys_cputs(const char *s, size_t len) {
    // Check that the user has permission to read memory [s, s+len).
    // Destroy the environment if not.
    user_mem_assert(curenv, s, len, PTE_U);
    // Print the string supplied by the user.
    cprintf("%.*s", len, s);
}

That concludes lab 3.