5 minute read

UCL Course COMP0133: Distributed Systems and Security LEC-02


I/O Concurrency in One Process

Filesystem

  • read-ahead into disk buffer cache

  • write-behind from disk buffer cache

Networking code

  • copy arriving packets into applicaiton’s Kernel socket buffer for read()

  • copy application’s data into Kernel socket buffer for write()


I/O Concurrency with Multiple Process

Idea

Start a new UNIX process for execute operations (each connection / request)

Method

Master process assigns new connections to child processes using fork() (Case Study: server_2())

Pros

  • Simple software structure

  • Isolation: bugs in one process will not affect other processes (OS Level, hardware still shares)

  • “Free”: if have more CPUs, each process may run on one CPU (CPU concurrency)

Cons

  • fork() method may be expensive (memory for new process, cost 300 us minimum - pre fork())

  • Isolation: memory not shared (across processe operations may be harder and costy)


CPU Concurrency

I/O concurrency tools often help with CPU concurrency

Cons:

  • More work for OS designers (using locks and unlocks)

  • Less important: 2X faster in CPU concurrency (100X in I/O concurrency)

  • Hard to program to get good scaling

Maybe buy 2 machines will be much easier


Concurrency with Threads

Similar to multiple process but to solve the shortcomings

Difference:

  • All threads share the same process memory (one address space)

  • Each thread has its own stack inside the process memory (keep tracking procedure call)

Drawbacks:

  • Programmer needs to use locks and unlocks: use where they need and in right way

    if whole process is blocked because only one thread blocks, no concurrency.

  • One thread can corrput another if locks and unlocks are used incorrectly


Kernel-Supported Threads

Kernel knows each thread state: if one thread blocks, schedule another thread

Kernel Requriements

  • Per-thread Kernel stack (system calls)

  • Per-thread tables (seperate live registers)

Pros

  • Kernel can schedule on thread per CPU (good for CPU concurrency & I/O concurrency)

Cons

  • Expensive:

    Help to create operation (system call)

    Help to context switch (page fault)

    Frequent locks and unlocks must invoke Kernel (system call)

  • Not Portable

    Implementation heavily tailored to each OS

All modern OS support Kernel Threads duet to performance (e.g., data center)


User-Level Threads

Purely inside user process (need scheduler within user process)

Requirments

  • Know when thread makes blocking system call

  • Switch to another runnable thread (not block process)

  • Know when I/O is done and wake up the original thread

Thread Library (pthread.h in C is not a user-level thread library)

  • contains fake system calls (e.g., read(), write(), accept() …)

  • is able to start non-blocking system operations

    return immediately from Kernel without data come back (need other event notification mechanism)

  • marks threads as waiting, and switches to runnable thread

  • Kernel notifies the I/O completion

  • marks waiting threads to runnable

fake_read()
{
    Tell kernel to start read
    Mark thread waiting for read
    sched()
}

sched()
{
    Ask Kernel for any I/O completion events
    Mark corresponding threads runnable
    Find runnable thread by scheduling policy
    Restore registers and return
}

Cons

  • User-level threads need significant Kernel support

    non-blocking system calls, uniform event notification mechanism

  • Hard to implement non-blocking system calls

    Need keeping state to let Kernel know where to continue in multi-step operations

Event Notification

Events Thread Library needs from Kernel:

  • New network connection

  • Data arrived on socket

  • Disk read completed

  • Socket ready for further write

In UNIX (partly supported):

New TCP connections, arriving TCP/pipe/tty data: Yes

Filesystem operation completion: No

Some system calls do not have non-blocking version (cannot start without waiting)

  • connect(), read(), write() on socket: Have

  • open(), stat(): Not Have

  • read() from disk: Sometimes


Event-Driven Programming

Aim

Overlap only wait for I/O competition (do CPU sequentially)

Method

Threads may not be a good approach (suffer from Locks and Unlocks)

Since events for I/O usually occur one-at-a-time (organize software around event arrival)

Write software in state-machine style

when seeing an event happens, call the handler (event-driven programming)

Properties (Desirable)

  • Serial nature of events preserved

  • Programmer sees only one event/function at a time (no Locks and Unlocks needed)

Stack Ripping (Undesirable)

In normal programming, debugger will told you where the bug is using a stack form (sequential algorithm)

In event-driven programming, difficult to locate in a long execution for debugging or rewriting


Worse is Better

MIT Approach

Correctness = Consistency > Completeness > Simplicity

New Jersey Approach

Simplicity > Correctness = Consistency > Completeness

Case Study: PC Losering Problem

Interrupt happens when a user application invokes a system call

MIT: stop the system call, then resume it in an appropriate place

New Jersey: do nothing, then return a error message to ask user to execute this system call again

Lesson

Think about which approach the system selects and why during this course