Lecture 5
last time - showed an example of how the same code can produce different results when run at different times. the result is no longer deterministic
#include <iostream>
#include <string>
#include <thread>

void Greet(std::string name) {
    std::cout << "Hello from " << name << std::endl;
}

int main() {
    std::thread t1{ Greet, "t1" };
    std::thread t2{ Greet, "t2" };
    std::cout << "Hello from original" << std::endl;
    // note: t1 and t2 are never joined before main returns
}
this caused an error when we ran it:
- when the original (main) thread finishes, the whole program finishes
- the runtime complains that the other threads were not terminated properly - t1 and t2 are still joinable (possibly still trying to run) when their std::thread objects are destroyed, so std::terminate is called: "abnormal termination"
to make the computer happy, do it the correct way:
- create the thread, then pause & wait for the thread to finish
#include <iostream>
#include <string>
#include <thread>

void Greet(std::string name) {
    std::cout << "Hello from " << name << std::endl;
}

int main() {
    std::thread t1{ Greet, "t1" };
    std::thread t2{ Greet, "t2" };
    std::cout << "Hello from original" << std::endl;
    // join: stop here, and wait for the thread to terminate
    // (if the thread is done already, no pause is introduced)
    t1.join();
    t2.join();
}
the join method is our first example of synchronization: we are controlling the order of thread termination
the above code does not cause any error messages: because we terminate the threads explicitly, there is no chance that t1 or t2 is still running when the main thread finishes
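side note (not shown in lecture): join is not the only way to keep the runtime happy. a minimal sketch, assuming the same Greet function, using std::thread::detach, which also makes the thread non-joinable so its destructor does not call std::terminate:

#include <iostream>
#include <string>
#include <thread>

void Greet(std::string name) {
    std::cout << "Hello from " << name << std::endl;
}

int main() {
    std::thread t1{ Greet, "t1" };
    // detach: release the thread to run on its own; t1 is no longer
    // joinable, so its destructor will not trigger std::terminate
    t1.detach();
    // caveat: when main returns, the whole process exits, and the
    // detached thread is cut off wherever it happens to be - so the
    // greeting may or may not be printed; join is the safer pattern here
}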
typical example of a race condition:
#include <iostream>
#include <thread>

int glob{ 0 };

void Foo() {
    for (int i{0}; i < 100'000; i++)
        glob++;
}

int main() {
    std::thread t1{ Foo };
    std::thread t2{ Foo };
    t1.join();
    t2.join();
    // typically less than 200k due to the race condition
    std::cout << glob << std::endl;
}
the global variable is shared between the threads because it lives in the data section (one copy per process, visible to every thread in that process)
issue in the above code: multiple threads are accessing and modifying the same variable
this kind of race condition is called a "data race" between two or more threads
- have the glob variable, initialized to 0
- thread 1 and thread 2 both try to do glob++
- for the sake of discussion, these threads run in parallel on two different cores
- since we don't know which core/which thread will do glob++ first, we don't know what order the instructions will be executed in
- in reality, glob++ compiles to several instructions, as follows (MIPS assembly):
# assume the address of glob is in $t0
lw $s0, 0($t0)     # load the current value of glob into $s0
addi $s0, $s0, 1   # increment the private copy in the register
sw $s0, 0($t0)     # store the register back to glob
- the same three instructions run in both threads, so one thread could read a stale value and then overwrite the other thread's saved value, effectively skipping an increment
- for example: each thread's register receives 0 as the first value from the load word instruction, then each adds 1, and then each writes back - regardless of which write happens first, the value in glob is still 1 and not 2 (see the sketch after this list)
- the same/similar scenario can occur when multitasking on the same core - a context switch could occur before the last instruction runs in one of the threads, which would cause the same issue
- in this case, a context switch is likely because of the memory access
- we could also not run into an issue - it is possible that the tasks happen to run in series (execution does not overlap), in which case glob would end up with 2
- to avoid data corruption, we want shared data to be accessed by only one process/thread at a time: shared resources should only be accessed by one entity at a time
- name for pieces/fragments of code that should not overlap in execution: critical sections. we need to prevent the execution of critical sections from overlapping
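the sketch referenced above: the load/add/store steps of glob++ written out as separate C++ statements (ThreeStepIncrement and the local variable reg are illustrative names, not from the lecture):

int glob{ 0 };

void ThreeStepIncrement() {
    int reg{ glob };   // lw: load the shared value into a "register"
    reg = reg + 1;     // addi: increment the private copy
    glob = reg;        // sw: write back, possibly overwriting an
                       // increment the other thread made in between
}

if both threads execute the first line before either executes the last, both write back 1 and one increment is lost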
so, a definition of race condition: a bad scenario where we corrupt a shared resource/data because of simultaneous access by multiple threads or processes, where at least one of them tries to modify the shared resource
(however, if all access is read-only, then there is no issue - it only becomes a problem when modification occurs)
these are the "nastiest kind of errors" because they are not reliably reproducible
if only one thread is modifying and the other is only reading, it can still cause issues:
- modifying the value may take several operations (e.g., updating the first and second halves of the integer separately)
- this can cause the data fetched by the reading thread to be corrupted
- only an issue if multiple operations are needed to modify the value as needed
- but it should be avoided in general - e.g., different hardware could give different results
- the compiler may actually help with this - C++ has the std::atomic class that helps avoid race conditions (see the sketch below)
- the compiler can tell whether the hardware permits "atomic" modifications
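a minimal sketch of the earlier counter rewritten with std::atomic (standard C++, though this exact code was not shown in lecture): the increment becomes a single indivisible read-modify-write, so no updates are lost:

#include <atomic>
#include <iostream>
#include <thread>

std::atomic<int> glob{ 0 };

void Foo() {
    for (int i{0}; i < 100'000; i++)
        glob++;   // atomic read-modify-write: no lost increments
}

int main() {
    std::thread t1{ Foo };
    std::thread t2{ Foo };
    t1.join();
    t2.join();
    std::cout << glob << std::endl;   // 200000 every run
}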
a simple way to avoid race conditions: "locks"
- have global shared data/device/file
- associate one extra shared variable: "lock"
- "mutex" lock - mutual exclusion
- simple integer, or boolean variable
- two states: "acquired" or "released", or in cpp, "locked" and "unlocked"
- all processes that deal with shared resource need to be aware that there is a shared lock associated with this resource
- all threads need to acquire/lock the lock before using the resource
example: mutex locking:
#include <iostream>
#include <mutex>
#include <thread>

int glob{ 0 };
std::mutex m;

void Foo() {
    for (int i{0}; i < 100'000; i++) {
        // acquire the mutex m
        // scenario 1: if the lock is free, close it
        // and proceed further
        // scenario 2: if the lock is not free,
        // freeze and do not proceed further until
        // the lock is released (and when it is released,
        // lock it again)
        m.lock();
        // critical section of code
        glob++;
        // once finished, open the lock to allow others
        // to use the variable
        m.unlock();
    }
}

int main() {
    std::thread t1{ Foo };
    std::thread t2{ Foo };
    t1.join();
    t2.join();
    // should be 200k because the race condition is avoided
    std::cout << glob << std::endl;
}
the above is not an efficient solution, but it is a solution nonetheless
it largely negates the benefit of threads, because essentially all of the work sits inside the critical section, so the increments run in series and not in parallel
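one way to win some parallelism back, as a sketch (the thread-local counter is an illustration, not from the lecture): do the repeated work on a variable private to the thread, and touch the shared variable under the lock only once:

#include <iostream>
#include <mutex>
#include <thread>

int glob{ 0 };
std::mutex m;

void Foo() {
    int local{ 0 };
    // no lock needed: local is private to this thread
    for (int i{0}; i < 100'000; i++)
        local++;
    // only one short critical section per thread
    m.lock();
    glob += local;
    m.unlock();
}

int main() {
    std::thread t1{ Foo };
    std::thread t2{ Foo };
    t1.join();
    t2.join();
    std::cout << glob << std::endl;   // 200000
}

also worth knowing: std::lock_guard (from <mutex>) locks a mutex in its constructor and unlocks it in its destructor, so the unlock cannot be forgotten even if the critical section throws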