Skip to main content Link Menu Expand (external link) Document Search Copy Copied

CSE 351


Lecture 1 - June 21

  • We will cover three main topics:
    • Data
    • Programs
    • Scale
  • The prefix for binary is: 0b, hex prefix is: 0x
  • With $n$ binary digits we can represent $2^n$ things
    • With $n$ digits of base $b$ we can represent $b^n$ things
  • ASCII characters are one byte, representing 256 characters
    • lol we have so much more memory now, hello UTF encoding

Lecture 2 - June 23

  • The CPU executes instructions which are also stored in memory
  • CSE 351 is about data movement - how is memory stored or found?
  • Base-2 is the default (although DNA/quantum computing is researched)
  • We manipulate memory as chunks of bytes (even if it doesn’t need all bits - hello bools!)
  • Definition “word”: a 8-byte chunk of memory
  • x86-64 systems use 64-bit words
    • The potential address space is 2^64 addresses
  • Pointers are also stored in memory like any other data
  • Big-endian: least signifcant byte stored in the largest address
    • Not commonly used
  • Little-endian: least significant byte stored in smallest address
    • x86, x86-64
  • Bi-endian (used on ARM/M1)
  • The pointer always points to the lowest address of the data

Lecture 3 - June 26

  • Pointers are defined: type* ptr;
    • The type encodes size information, the * designates it is a pointer
  • To get the address of a variable, write: type* ptr = &var;
    • The address must be stored in a pointer
  • The “dereference” operator * is used to “grab” the data a pointer points to
    • i.e. int num = *ptr;
  • NULL is a special pointer to nothing
  • Addition of a pointer and a scalar scales the scalar by the size of the type
  • Subtraction of two pointers the value is scaled (divided) by the size of the type
    • We can use this to find the amount of types between two pointers
  • Arrays are sets of contiguous locations in memory that store the same type of data object
    • C declaration: type name[size]
    • Indexing actually works with: *(array_name + n)
  • “Strings” in C are actually just arrays of chars terminated by the null character
    • i.e. char str[] = "hi" allocates an array of size three
    • Null character is backslash 0
      • not every array in C has this null character
  • In C, declaration does not mean initialization: there can be garbage data

Lecture 4 - June 28

  • 0 is false, everything else is true
  • Bitwise operators:
    • AND: &
    • OR: |
    • XOR: ^
    • NOT: ~
  • Logical operators:
    • AND: &&
    • OR: ||
    • NOT: !
  • Unsigned integers work as expected
  • Signed integers use: Two’s Complement
    • The weight of the most significant bit is negative
    • This means -x == ~x + 1
  • Bitmasking allows us to change our inputs as we want

Lecture 5 - June 30

  • Signed/unsigned integers are represented in memory the same way
  • Type casting includes:
    • Changes in bit-widths
    • Changes in interpretation
    • Full changes in representations
  • Implicit casting happens when there is a type mismatch, but a well defined conversion between the two types
  • Explicit casts: (new_type) expression
  • When casting between bit-widths, we must extend the data
    • Zero extension: padding with extra zeros on the left (preserves the value)
    • Sign extension: pad signed data with copies on most signficant bit (preserves sign and value)
  • Operations between signed and unsigned values implicitly cast all values to unsigned representations
  • Fixed-with binary arithmatic is modular
    • This results in integer overflow
  • Bit shifting: move all bits a specified amount in a direction, filling in what is lost
    • Left shift: x << n fills in n zeros on the right
    • Logical right shift: x >> n where x is unsigned, fills in zeros on the left
    • Arithmetic right shift: x >> n where x is signed, fills in with copies of most significant bit on the left
  • Bit shifting can be interpreted as integer multiplication or division by powers of two
  • To force unsigned ints, we must append number with u or U

Lecture 6 - July 5

  • Masking in C: #define SEARCH replace
  • IEEE 754 floating point encoding uses normalized scientific binary notation:
    • This includes the sign, mantissa, and exponent
    • 32 and 64 bit encodings for float and long
  • Sign is represented using just a single bit
  • Exponent is biased notation:
    • The value is shifted by: $2^{w - 1} - 1$ where $w$ is the width of the exponent field
    • Bias is a zero followed by all ones: 0b01...1
    • Bias is an unsigned int
    • Positive is one bigger
  • Mantissa:
  • 32 bit encoding (float):
    • first bit is sign
    • next 8 bits are exponent
    • last 23 bits are mantissa
  • 64 bit encoding (long):
    • first bit is sign
    • next 11 bits are exponent
    • last 52 bits are mantissa
  • Steps to go from FP -> decimal:
    1. Append the bits of $M$ to implicit leading 1 to form the mantissa.
    2. Multiply the mantissa by $2^{\text{E - bias}}$
    3. Multiply the sign $(-1)^{\text{S}}$
    4. Multiply out the exponent by shifting the binary point.
    5. Convert from binary to decimal.
  • Steps to go from decimal -> FP:
    1. Convert decimal to binary.
    2. Convert binary to normalized scientific notation.
    3. Encode the sign as $S$ (0 or 1).
    4. Encode $E$ as exponent + bias in unsigned.
    5. Encode the 23 bits after the binary point (padding with trailing 0’s as needed) into $M$
  • Precision is the number of bits in a computer word to represent a value
    • Gives capacity for accuracy
  • Accuracy is a measure of the difference between the true and computer representations of a value
  • If the exponent and mantissa are zeros, the float is zero
  • If the exponent is all ones and mantissa is zero, it is positive or negative zero
  • If the exponent is all ones and mantissa is not zero, it is not a number (NaN)
    • The mantissa will store some value about what went wrong
  • When the exponent is zero and the mantissa is not zero, we have denormalized numbers
    • This fills in the gaps between 0 and a real number
  • FP errors:
    • Overflow: exp too large
    • Underflow: exp too small
    • Rounding: between possible numbers

Lecture 7 - July 7

  • Instruction set architecture (ISA) is the processor spec necessary to write assembly code
    • System state: all information that defines culmination of past instructions
    • Instruction set: list and format of all instructions the CPU can execute
    • The effect on the system state of each instruction
  • The spectrum of ISAs falls between CISC and RISC:
    • CISC: complex instruction set computer, easy for programmers but harder for hardware
    • RISC: reduced instruction set computer, easier for hardware but low-level programming
  • x86-64 is CISC, we will only look at integral data and use “AT&T” syntax
  • In x86-64 assembly, instructions are specified using a short instruction name followed by 0-3 operands
    • We will primarily look at instructions with 1-2 operands
    • “AT&T” syntax typically follows:
      • instr op
      • instr src, dest
  • Main types of x86-64 instructions:
    • Data transfer: copy data from one place to another
    • Arithmetic and logical operations
    • Control flow
  • Each instruction includes a “size specifier” letter appended:
    • b for byte
    • w word (two bytes)
    • l long word (4 bytes)
    • q quad word (8 bytes)
  • addb does integer addition using a byte data width
  • Operands for x86-64 are one of three types:
    1. Immediates: constant integer data, uses $ prefix
    2. Registers: name of any of the 16 general purpose registers, uses % prefix
    3. Memory: specific address, ususally dereferenced, parenthesis before a %
  • For binary instructions (two operands), the operations can be used in any combination except:
    • Immediate cannot be used as a destination operand
    • No memory to memory operation (both operations cannot be Memory)
  • A register is a location in the CPU which stores small amount of data (word size) which can be accessed quickly
    • Fixed number per architecture
      • x86-64 has 16 with 8-byte word size
    • Referred by name, different for smaller divisions
    • Some registers are reserved
  • Register names referring to smaller divisions always refer to least significant bytes of the register
    • %ax refers to the lowest two bytes of %rax
  • When a smaller register name is used, the upper bytes are untouched except:
    • Any instruction that generates a 32-bit value for a register also sets the higher bits to all zeros (x86-64)
  • General way to express Memory operand: D(Rb,Ri,S)
    • D: displacement value, must be an immediate or constant
    • Rb: base register, name of the register which acts as the “base” of our address calculation
    • Ri: index register, name of the register whos value will be scaled and added to the base
    • S: scale factor, scales the value in Ri by the specified number, must be 1, 2, 4, or 8
  • The computed address is: Reg[Rb] + Reg[Ri] \* S + D where Reg[] means “the value in the register specified”
    • Most instructions will dereference this, giving the memory of this
  • Default values of Memory operand:
    • D = 0
    • Reg[Rb] = 0
    • Reg[Ri] = 0
    • S = 1
  • GCC and Clang can compile down to x86-64 or ARM instruction sets
  • Assembly doesn’t have “data types”, only byte amounts
  • “Moving” data actually is more similar to copying data

Lecture 8 - July 10

  • Calling conventions: for assembly functions
    • We store first argument in %rdi
    • We store second argument in %rsi
    • Return value is stored in %rax
  • Addressing modes:
    • We can use parenthesis to dereference pointers i.e. (%rax)
  • Remember registers can store pointers as well!
  • leaq src, dst: load effective address
    • Breaks the rules :/
    • src is address expression
    • dst is a register
    • Sets dst to the address computed by the src expression
      • Only does math, doesn’t do memory stuff
  • As multiplication is an expensive operation, it is often broken down to bitshifts and additions
  • %rip stores a pointer to the next instruction we will execute
  • There are one bit “condition codes” which store one-bit of memory and are implicitly set
    • CF carry flag
    • ZF zero flag
    • SF sign flag
    • OF overflow flag
  • Compare cmpq a, b sets condition codes by doing b - a
  • testq a, b does the same, setting codes with a&b
    • testq a, a allows us to test based on an existing value
  • jmp is unconditional
    • Updates %rip
  • set* sets the lowest byte of a register to 0 or 1 based on current condition codes

Lecture 9 - July 12

  • Extension instructions: movz and movs
    • Similar to the mov command, but source operand is shorter than the dest operand
    • Performs zero extension and sign extension
    • Requires two width specifier suffixes
  • Conditionals are typically made up of two instructions:
    1. Logical instruction that sets/changes the condition codes (cmp, test)
    2. A conditional jump instruction (j*)
  • Operand to a jump instruction is a label
    • Name followed by a colon, i.e. main:
    • Labels are symbolic, representing the location of the next instruction found
  • Loops are made by using a jump statement to go backwards to beginning of loop body
    • Conditionals can be at the top or bottom of the loop
  • Switch statement branches condition depending on a value
    • Need to break to prevent falling through cases
    • Jump table has an “array” of pointers to code
    • Those code blocks are contiguous
  • ja: jump to the default case (unsigned case)
  • jmp *.L4(,%rdi,8): jump table, index starting at .L4

Lecture 10/11 - July 17

  • The stack is a section of memory which takes up the highest possible address space
    • Holds local variables and procedural context
    • It grows downwards, occupying lower addresses as it needs more space
    • The end of the stack is indicated by the address stored in the stack pointer %rsp
  • Stack pointer can be modified by subq and addq, allocation and deallocation
    • Doesn’t affect any data stored in memory
  • Stack pointer can also be modified with push src and pop dest
    • push decrements %rsp then copies data from source to %rsp
    • pop copies data from %rsp to dest, then increments %rsp
  • Calling conventions: ensure proecdures can pass data and control one another
    • We store a return value on the stack, pointer to what the caller should execute next
    • To pass control to another procedure we use the call label instruction
    • To return control we use the ret instruction
  • To pass arguments we must use specific registers
    • “Diane’s Silk Dress Cost $89” is a memonic
    • All other arguments must be placed on the stack in reverse order
  • The return value must be placed in %rax
    • This could be a pointer to the return value
  • Stack frames are a way of enabling recursion, dividing our stack into chunks
    • Each of these frames are for only one call
  • Stack frame layout:
    1. return address from call
    2. (optional) %rbp frame pointer, indicator of the beginning of the current frame
    3. Old register values need to be pushed, followed by local variables
    4. If this calls another procedure, more register values and arguments 7+
  • As any procedure can overwrite registers, we have register saving conventions
    • Calle-saved registers it is the callee’s responsibility to save the old value before using the register
      • Restores the old value before returning
    • Caller-saved registers it is the responsibility of the caller to save the old value before passing control

Lecture 12 - July 19

  • A buffer is a region of memory used to temporarly store data
  • A buffer overflow is writing past the end of the buffer into adjacent memory
  • Stack smashing is writing past the end of a local array in the stack
    • We can write back enough in the frame to overwrite the return address
  • gets() can write user input past the end of an array
  • Code injection:
    1. Write exploit code/binary and put it at the start of the buffer
    2. Pad to the end of the stack frame
    3. Replace return address with buffer address (exploit code)
  • Good practice when exploiting someone’s code not to mess with past stack frames, so normal operation can be resumed

Lecture 13 - July 21

  • Process of building and running an executable has four phases - CALL:
    • Compiling
    • Assembling
    • Linking
    • Loading
  • Building an executable is the first three phases
    • This is done by gcc
  • Compiling:
    • Compiler translates a text file of source code to a text file of assembly
      1. In C, preprocessor step which runs command that start with #
      2. Interpret the meaning of the code and convert to assembly
    • We can supply compiler optimization flags to gcc to change certain attributes
  • Assembly:
    • Assembler converts a text file of assembly code to a binary object file
      • These contain object code which is incomplete machine code
      • The object code is incomplete because it lacks the addresses associated with the labels of the final executable
    • In order to patch the object code, we create two tables:
      1. Symbol table: list of globally-accessible labels
      2. Relocation table: list of the addresses to be patched
  • Linking:
    • The linker stiches together the object and static library files necessary to build the final executable
    • Links all the symbol and relocation tables
    • “Unresolved symbols” (i.e. undefined function) will lead to failure
  • Disassembling:
    • Reading binary code and interpreting it as instructions is disassembling
    • ghidra my beloved
  • Loading:
    • The loader takes an executable and starts a running process
    • Sets up memory sections and initializes register values
    • Primarily handled by the OS
  • Arrays in assembly:
    • The name of an array arr in C is turned into a placeholder for the starting address of an array
    • Array subscription (arr[idx]) in assembly is: D(,Ri,S) with D the array name/label, Reg[Ri] the index, and S = sizeof(T)
  • Multidimensional arrays are a contiguous chunk of memory for all arrays
    • Accessing an element is an address calculation followed by a single memory acceess
  • Multilevel arrays do not require a large contiguous block of memory, but extra memory accesses
    • Stores pointers vs arrays, allows variable array sizes

Lecture 14 - July 24

  • A struct in C is a user-defined, structured group of variables
    • Defining a struct: struct struct_tag {type1 field1; ...}
    • Declaring a struct variable instance: struct struct_tag my_struct_var;
      • The initial struct does not need to be written if you typedef it
    • Possible to also declare and intialize a struct at once, use *ptr to declare pointer
  • Struct name can (typically) only be used after it was defined and in scop
  • Used to allow compiler or program to understand the size and layout of the struct
  • A struct can have fields of any type except itself (but can have pointers to itself)
  • Fields are accessed with . operator or -> from a pointer
  • As two word data type names are unweildy, we often combine them with typedef:
    • i.e. typedef unsigned int uint;
    • This can be done with structs after defintion or while defining:
      • typedef struct {type1 field1; ...} struct_tag;
  • A primitive object of K bytes in memory is aligned if address % K = 0
  • The struct layout follows the ordering of the fields with padding inserted such that each field is aligned
    • The padding (unused space) is known as internal fragmentation
  • The overall size of the struct also needs to be alligned:
    • overall_size % K_max = 0 where K_max is the largest type in the fields
    • This padding is called *external fragmentation`

Lecture 15 - July 26

  • IEC prefixes use a base 1024 and refer to large numbers in computing
  • Caches ($) are intermediate places of storing memory for a short time
    • Data is transferred between caches and memory in blocks, machine-specified units larger than a word
  • When accessing memory, the machine checks the cache first:
    • If the data is in the cache, that is a cache hit
    • Otherwise it must be fetched from memory and stored in the cache, a cache miss
      • Placement policy determines where the block is placed
      • Replacement policy determines what memory is removed from the cache
  • The principle of locality: programs tend to use data at addresses near to what was used recently
    • Temporal locality: referenced items are likely to be referenced in the near future
    • Spatial locality: items with nearby addresses tend to be referenced close together in time
  • Cache performance metrics:
    • Hit Time: how long a cache hit takes
    • Miss Penalty: how long it takes to fetch a block of data from memory
    • Hit/Miss Rate: the fraction of memory accesses that result in a hit or miss respectively
    • Average Memory Access Time = hit time + miss rate * miss penalty
  • There are typically multiple levels of caching
    • Some are closer (quicker) to memory
    • Block size might differ between cache levels
    • AMAT is calculated for overall system caching

Lecture 16 - July 28

  • Caches are one of the quickest parts of the memory hierarchy
  • Cache size is the capacity of the cache, always a multiple of the block size
  • The block number of A is simply: A // K
  • The block offset of A is: A % K
  • Direct-mapped placement strategy: hash and index the blocks
  • Higher types of memory on the hierarchy cost more (money)
  • Caches act as caches for each other on different levels
  • On a CPU, each core has a d-cache and i-cache
    • These are L1 level
    • d-cache is for data, what we typically focus on
    • i-cache is for instructions, such as %rip
  • We then have a L2 unified cache between the two
  • Below, we have a L3 unified cache between all the different cores
  • We hash addresses for caches
  • The least significant bits from the address map to the cache
  • As we overwrite the cache blocks with new ones, we need a way to determine which memory value it is from
    • We use the leading (remaining) bits from the address and store them in the tag field
    • The tag field is not part of the memory size
  • TIO address breakdown:
    • Index: tells you where to look in cache
    • Tag: checks if the block is the correct block
    • Offset: finds the byte you’re looking for

Lecture 17 - July 31

  • Repeating patterns of block access which are offset can lead to no relevant data being stored in cache
  • Associativity: we can now have an arbitrary number of blocks per set
    • Kinda like a fixed-length linked list hashmap
    • Direct-mapped caches are $E = 1$
  • We need a valid bit to ensure that the data is not randomly initialized
  • When replacing data in a cache, we typically replace the least recently used (LRU)
    • On hardware, this is often not most recently used
  • Three types of misses:
    • Compulsory miss: first access to a block
    • Conflict miss: multiple data objects map to the same slot
    • Capacity miss: the active set of blocks of memory is larger than the cache

Lecture 18 - August 2

  • There is the issue of writing to memory with caches
  • Write-hit: data already in cache
    • Write-through: write immediately to next level
    • Write-back: defer write to next level until line is evicted
      • Requires us to keep track of what has been modified, using the dirty bit
  • Write-miss: data not in cache:
    • Write allocate: load into cache, then execute the write-hit policy
    • No-write allocate: simply write to next level immediately
  • Caches are incredibly important for matmuls
    • We can block the matricies into sub-matricies and matmul them - see linalg textbook
    • We want to make sure that $3r^2 < C$
  • Always use reasonably small set of data (spatial locality)

Lecture 19 - August 4

  • Static global data: fixed size at compile time, exists for the entire runtime
  • Stack allocated data: size known at runtime, no resizing
    • It exists only for the lifetime of a function
  • Dynamically allocated data: between the two extremes
    • Use malloc and free to do so
  • Explicit allocation: programmer allocates and frees space
    • C or other low-level languages
  • Implicit allocator: programmer only allocates space
    • Java, Python, high-level languages
  • %brk points to the top of the heap, which grows up
    • When we need more space, we increment %brk
  • malloc
    • Returns a void*: an arbitrary pointer
    • Takes in a size t size, a long representing an amount of memory
    • Returns null a pointer to allocated memory or NULL if failed
  • Good practice is: (int*) malloc(n*sizeof(int));
  • calloc: malloc, but zeros all of the data
  • realloc: changes the size of a previously allocated block
  • sbrk: function to increase or shrink the heap (change %brk)
  • free
    • Takes in a pointer obtained by malloc
    • Returns the space to the heap
    • No action occurs if you call free on NULL
  • malloc allocates in words of 8 bytes
  • Internal fragmentation happens when payload is smaller than the block
  • External fragmentation: unused/unallocated memory between blocks in the heap
  • We keep a header field for all allocations
    • This stores the size of the block in a word
  • As all of the sizes are multiples of 8, the last size bits are always zero
    • Thus, we can use the last bit to store if it is free or not

Lecture 20 - August 7

  • x86 typically uses 16-byte alignment
  • One-word marker is the end of the list: size zero but allocated
  • First fit allocation:
    • Start at the beginning of the list, then jump until we have a free block of correct size
  • Next fit allocation:
    • Start the search where the previous search finished
  • Best fit allocation:
    • Search through the entire list, return the free block with closest size
  • When we allocate in a free block with not excess space, we need to split it
    • This means we add the correct size for remaining free blocks
  • When freeing data, we could just zero the last bit
    • This leads to the problem of false fragmentation
    • Instead, whenever we free a block we coalesce with the next free blocks
      • This works easilly with the next free blocks
  • To coalesce backwards, we store the size in a word after the payload as well
  • Explicit free lists:
    • Allocated block stays the same
    • Free blocks store a next and previous pointer in memory, where the payload would be
  • This makes up a doubly-linked list
    • The blocks are not necessarily sequentially linked in memory
  • Explicit free list allocation policies:
    • LIFO: insert block at the beginning of the free list
      • Simple and constant time performance
      • Bad caching performance
    • Address-ordered policy: insert free list blocks in address order
      • Requires linear time search
      • Improved caching performance
  • We don’t need the proceding boundary tag if the block is not free
    • We can not use the tag by keeping an extra flag, is preceeding block free
  • Comparisons between explicit and implicit free lists:
    • Explicit free lists are faster when most of the memory is full
    • Explicit free lists are more challenging to implement
    • Implicit free lists require less pointers, leading to a smaller minimum block size
    • Explicit free lists are most commonly used with segregated free lists
      • These are different lists of multiple sizes or object types
  • Minimum block size:
    • When we split a block, we need to know if the leftover is worth it
    • For an explicit free list, we need 32 bytes
      • 4 words which are pointers and sizes for free blocks

Lecture 21 - August 9

  • We use exceptional control flow to manage multiple processes at a time and low-level mechanisms
    • Exceptions (hardware)
    • Signals (OS)
  • Typical control flow is sequential and reacts only to program state
    • This can’t handle modern challenges, i.e. system interupts or program stalling
  • Kernel: a core component of an operating system, privileged instructions
    • The lowest layer and security mechanism
  • For managing exceptions, we use an exception table
    • This classifies different exceptions into their behaviors/responses
  • Asynchronous exceptions:
    • Interrupts: caused by events external to the processor
      • e.g. Timer interrupt, a interrupt allowing the OS to take back control of programs
  • Synchronous exceptions:
    • Traps:
      • Intentional way to transfer control to the OS to perform a function
      • E.g. syscalls
      • Returns control to the next instruction
    • Faults:
      • Unintentional but can be recovered
      • E.g. divide-by-zero
      • Either re-executes the instruction or aborts
    • Aborts:
      • Unintentional and unrecoverable
      • E.g. hardware failure
      • Aborts the current program
  • Each program gets its own process, an illusion that a processor is entirely for itself
    • A process is an interface between the program and hardware
    • Programs get their own logical control flow
    • It gives programs a private address space
  • The OS tells each processor when to switch between programs
  • To switch between processes, we need to store execution context
    • This is context switching
    • It saves current registers, schedules the next process, and loads the saved registers and address space
  • Two processes run concurrently if their instructions run at the same time
  • Kernel code manages processes as part of a user process, switching between them
  • In Linux, when creating a new process:
    • First, the current process is fork()‘ed
      • fork() creates a copy of the current process
      • It returns 0 to the child process and the child’s process ID (PID) to the parent
    • Second, exec*() is called on the new process from the fork
      • exec*() replaces a current process’ code and address space with that for another program
  • A process graph is a way to visualize executions and processes

Lecture 22 - August 11

  • exec*() functions loads a fresh instance of the specified program, the “Loader” from “CALL”
  • Execution context:
    • Data
      • Static data and executables are copied from the program executable
    • Code
      • Copied from the program executable
    • Heap
      • Empty: no allocated memory could have yet been requested
    • Stack
      • Setup the main stack frame and command-line arguments
    • Registers
      • main takes two inputs, %rdi and %rsi
      • %rip and %rsp need to point to main and the stack
  • Ending a process - three typical ways:
    1. Executing a return statement from main
    2. Calling the library function exit()
    3. From an abort by the exception handler
  • 1 and 2 provide a status code that can be read by the parent process
  • A terminated process must be reaped to read the status code and delete resouces
    • A process which is terminated but not yet reaped is a zombie process
  • Reaping is done by the parent process can can be done explicitly or implicitly
    • Explicitly: using wait() or waitpid() calls
    • Implicitly when the parent process terminates
  • When a parent process terminates before a child process, responsibility for reaping is given to init or systemd
    • These have a PID of 1
  • Abstracting memory for specific programs is known as virtual memory
  • Virtual address space: the amount of bytes a process thinks is available
  • Physical address space: the amount of bytes that is physically available on RAM
  • There is also extra space allocated on disk, called swap space for excess temporary memory
  • The computer needs to map the virtual address space into physical address spaces

Lecture 23 - August 14

  • The memory management unit (MMU) maps virtual addresses to physical addresses
  • An n-bit CPU generated address is split into virtual page number and page offset
    • The physical and virtual offsets are aligned
  • We use a page table lookup table
    • We replace virtual page numbers with physical page tables to generate physical addresses
  • PTE: page table entry (PPN + management bits)
  • Page tables store PPN and a valid bit (is the page in RAM?)
    • If the valid bit is zero and the data is null, it is not mapped at all
    • If it is zero and data is not null, the page is stored in disk (or other virtual memory)
  • Each process has its own page table
  • When we switch between processes, we also switch the page table (along with other parts of memory)
  • Page hit: VM reference is in physical memory
  • Page fault: VM reference is not in physical memory
  • There are also read, write, and execute bits for the data

Lecture 24 - August 16

  • We’re done!