Starting with this article, we move on to the memory subsystem. Building on the earlier discussion of task_struct, we first cover mm_struct, the structure that manages a task's address space, and briefly review physical and virtual memory. For the underlying concepts and the details of address mapping, refer to CSAPP; we will not repeat them here, and instead assume familiarity with the mapping model. Later articles will cover physical memory management and the memory mappings of user space and kernel space.
For a process, we must consider the structures the kernel keeps for both its user-space and kernel-space parts. The user-space side includes:
```c
struct mm_struct *mm;
struct mm_struct *active_mm;
/* Per-thread vma caching: */
struct vmacache vmacache;
```
mm_struct itself is fairly complex, so we will walk through it in steps. First, the split between kernel-space and user-space addresses: highest_vm_end records the highest end address of any VMA in this address space, while task_size is the size of the user-space portion.
```c
struct mm_struct {
    ......
    unsigned long task_size;       /* size of task vm space */
    unsigned long highest_vm_end;  /* highest vma end address */
    ......
}
```
TASK_SIZE is defined as follows. As the comment shows, on 32-bit x86 user space gets 3 GB of the 4 GB virtual address space by default. On 64-bit, the address space is so vast that an unused gap is left between user space and kernel space as isolation: user space uses only 47 bits, i.e. 128 TB, and kernel space is likewise allocated 128 TB at the top of the address space.
```c
#ifdef CONFIG_X86_32
/*
 * User space process size: 3GB (default).
 */
#define TASK_SIZE        PAGE_OFFSET
#define TASK_SIZE_MAX    TASK_SIZE
/*
config PAGE_OFFSET
        hex
        default 0xC0000000
        depends on X86_32
*/
#else
/*
 * User space process size. 47bits minus one guard page.
 */
#define TASK_SIZE_MAX    ((1UL << 47) - PAGE_SIZE)
#define TASK_SIZE        (test_thread_flag(TIF_ADDR32) ? \
                          IA32_PAGE_OFFSET : TASK_SIZE_MAX)
......
```
On the user-space side, mm_struct has the following members:
```c
struct mm_struct {
    ......
    unsigned long mmap_base;        /* base of mmap area */
    unsigned long mmap_legacy_base; /* base of mmap area in bottom-up allocations */
    ......
    unsigned long hiwater_rss;      /* High-watermark of RSS usage */
    unsigned long hiwater_vm;       /* High-water virtual memory usage */
    unsigned long total_vm;         /* Total pages mapped */
    unsigned long locked_vm;        /* Pages that have PG_mlocked set */
    atomic64_t    pinned_vm;        /* Refcount permanently increased */
    unsigned long data_vm;          /* VM_WRITE & ~VM_SHARED & ~VM_STACK */
    unsigned long exec_vm;          /* VM_EXEC & ~VM_WRITE & ~VM_STACK */
    unsigned long stack_vm;         /* VM_STACK */
    spinlock_t arg_lock;            /* protect the below fields */
    unsigned long start_code, end_code, start_data, end_data;
    unsigned long start_brk, brk, start_stack;
    unsigned long arg_start, arg_end, env_start, env_end;
    unsigned long saved_auxv[AT_VECTOR_SIZE]; /* for /proc/PID/auxv */
    ......
}
```
These members let us lay out the various parts of user space, but we still need a structure that describes the properties of each region: vm_area_struct.
```c
struct mm_struct {
    ......
    struct vm_area_struct *mmap; /* list of VMAs */
    struct rb_root mm_rb;
    ......
}
```
The definition of vm_area_struct is shown below. The VMAs are chained into a doubly linked list through vm_next and vm_prev (and also indexed by the red-black tree rooted at mm_rb), so a series of vm_area_struct instances describes every region a process has mapped in user space.
```c
/*
 * This struct defines a memory VMM memory area. There is one of these
 * per VM-area/task. A VM area is any part of the process virtual memory
 * space that has a special rule for the page-fault handlers (ie a shared
 * library, the executable area etc).
 */
struct vm_area_struct {
    /* The first cache line has the info for VMA tree walking. */
    unsigned long vm_start;     /* Our start address within vm_mm. */
    unsigned long vm_end;       /* The first byte after our end address
                                   within vm_mm. */
    /* linked list of VM areas per task, sorted by address */
    struct vm_area_struct *vm_next, *vm_prev;
    struct rb_node vm_rb;
    /*
     * Largest free memory gap in bytes to the left of this VMA.
     * Either between this VMA and vma->vm_prev, or between one of the
     * VMAs below us in the VMA rbtree and its ->vm_prev. This helps
     * get_unmapped_area find a free area of the right size.
     */
    unsigned long rb_subtree_gap;
    /* Second cache line starts here. */
    struct mm_struct *vm_mm;    /* The address space we belong to. */
    pgprot_t vm_page_prot;      /* Access permissions of this VMA. */
    unsigned long vm_flags;     /* Flags, see mm.h. */
    /*
     * For areas with an address space and backing store,
     * linkage into the address_space->i_mmap interval tree.
     */
    struct {
        struct rb_node rb;
        unsigned long rb_subtree_last;
    } shared;
    /*
     * A file's MAP_PRIVATE vma can be in both i_mmap tree and anon_vma
     * list, after a COW of one of the file pages.  A MAP_SHARED vma
     * can only be in the i_mmap tree.  An anonymous MAP_PRIVATE, stack
     * or brk vma (with NULL file) can only be in an anon_vma list.
     */
    struct list_head anon_vma_chain; /* Serialized by mmap_sem & page_table_lock */
    struct anon_vma *anon_vma;  /* Serialized by page_table_lock */
    /* Function pointers to deal with this struct. */
    const struct vm_operations_struct *vm_ops;
    /* Information about our backing store: */
    unsigned long vm_pgoff;     /* Offset (within vm_file) in PAGE_SIZE
                                   units */
    struct file *vm_file;       /* File we map to (can be NULL). */
    void *vm_private_data;      /* was vm_pte (shared mem) */
    atomic_long_t swap_readahead_info;
#ifndef CONFIG_MMU
    struct vm_region *vm_region; /* NOMMU mapping region */
#endif
#ifdef CONFIG_NUMA
    struct mempolicy *vm_policy; /* NUMA policy for the VMA */
#endif
    struct vm_userfaultfd_ctx vm_userfaultfd_ctx;
} __randomize_layout;
```
For an mm_struct, its many vm_area_struct instances are constructed when the ELF file is loaded, i.e. in load_elf_binary(). After parsing the ELF format, this function builds the memory mappings, mainly: setting up the stack (setup_arg_pages), mapping the ELF code and data segments (elf_map), setting up the heap (set_brk), loading the dynamic linker (load_elf_interp), and finally recording the region boundaries in mm_struct:
```c
static int load_elf_binary(struct linux_binprm *bprm)
{
    ......
    setup_new_exec(bprm);
    ......
    /* Do this so that we can load the interpreter, if need be.  We will
       change some of these later */
    retval = setup_arg_pages(bprm, randomize_stack_top(STACK_TOP),
                             executable_stack);
    ......
    error = elf_map(bprm->file, load_bias + vaddr, elf_ppnt,
                    elf_prot, elf_flags, total_size);
    ......
    /* Calling set_brk effectively mmaps the pages that we need
     * for the bss and break sections.  We must do this before
     * mapping in the interpreter, to make sure it doesn't wind
     * up getting placed where the bss needs to go.
     */
    retval = set_brk(elf_bss, elf_brk, bss_prot);
    ......
    elf_entry = load_elf_interp(&loc->interp_elf_ex,
                                interpreter,
                                &interp_map_addr,
                                load_bias, interp_elf_phdata);
    ......
    current->mm->end_code = end_code;
    current->mm->start_code = start_code;
    current->mm->start_data = start_data;
    current->mm->end_data = end_data;
    current->mm->start_stack = bprm->p;
    ......
}
```
Because the address-space sizes of 32-bit and 64-bit systems differ so much, their layouts also differ structurally; we discuss the two in turn.
The kernel's virtual address space is independent of any particular process: every process that enters the kernel through a system call sees the same kernel virtual address space. The figure below shows the 32-bit kernel-space layout.
[Figure: 32-bit kernel virtual address space layout]
1. Direct mapping region

The first 896 MB form the direct mapping region, which maps straight onto physical memory: subtracting 3 GB (PAGE_OFFSET) from a virtual address in this region yields the corresponding physical address. The kernel provides two macros for this conversion: __va() turns a physical address into its direct-mapped virtual address, and __pa() does the reverse.
2. High memory (high_memory)

The name comes from the x86 division of physical memory into three zones: ZONE_DMA, ZONE_NORMAL, and ZONE_HIGHMEM; ZONE_HIGHMEM is high memory.

"High memory" is the memory-management module's term for the physical memory above the 896 MB direct mapping region. Outside the memory-management module, the rest of the kernel operates only on virtual addresses; the memory-management module itself works with physical addresses to allocate and map virtual addresses. High memory exists so that a 32-bit kernel's limited address space can still reach an essentially unbounded amount of physical memory: borrow a stretch of kernel virtual address space, map it onto the physical range to be accessed (by filling in kernel page-table entries), use it briefly, and then give it back.
3. Kernel dynamic mapping space (noncontiguous memory allocation)

The region between VMALLOC_START and VMALLOC_END is the kernel dynamic mapping space. Just as a user-space process requests memory with malloc, kernel code can request memory here with vmalloc. Kernel mappings are managed by their own page tables, separate from user space.
4. Persistent kernel mapping (permanent kernel mapping)

The range from PKMAP_BASE to FIXADDR_START, i.e. from 4G-8M to 4G-4M, is the persistent kernel mapping area. When alloc_pages() returns a struct page for high memory, kmap() can map that page into this region. Because the number of permanent mappings is limited, a mapping should be released with kunmap() once the high memory is no longer needed.
5. Fixed mapping region

The range from FIXADDR_START to FIXADDR_TOP (0xFFFFF000) is the fixed mapping region, which serves special-purpose mappings.
6. Temporary kernel mapping

Temporary kernel mappings are created with kmap_atomic() and released with kunmap_atomic(); they are used when the kernel must briefly write to a physical page, for example when writing data out to a file.
The 64-bit kernel address space is so large that it needs none of the 32-bit layout's careful budgeting; it simply leaves large unused gaps as guard regions. The layout is shown below.
[Figure: 64-bit kernel virtual address space layout]
This article analyzed the structure of memory in user space and kernel space in some detail; on this foundation, later articles can move on to memory management and mapping.
Reposted from: 玩转linux内核
Source article: 任务空间管理
Original link: https://zhuanlan.zhihu.com/p/440468631
Page updated: 2024-04-14