C++ - Porting threads to Windows. Critical sections are very slow


I'm porting some code to Windows and have found threading to be extremely slow. The task takes 300 seconds on Windows (with 2 Xeon E5-2670 8-core 2.6 GHz CPUs = 16 cores) and 3.5 seconds on Linux (Xeon E5-1607 4-core 3 GHz). I'm using VS2012 Express.

I've got 32 threads all calling EnterCriticalSection(), popping an 80-byte job off a std::stack, calling LeaveCriticalSection(), and then doing some work (250k jobs in total).

Before and after every critical section call I print the thread ID and the current time.

  • The wait time for a single thread's lock is ~160 ms
  • Popping the job off the stack takes ~3 ms
  • Calling leave takes ~3 ms
  • The job itself takes ~1 ms

(Roughly the same for Debug and Release; Debug takes a little longer. I'd love to be able to profile the code properly :P)

Commenting out the job call makes the whole process take 2 seconds (still more than on Linux).

I've tried both QueryPerformanceCounter and timeGetTime; both give approximately the same result.
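For anyone wanting to reproduce the measurements, here is a minimal sketch of interval timing with QueryPerformanceCounter. The helper name nowMs is hypothetical, and the non-Windows branch is only an assumption added so the snippet builds elsewhere; it is not the asker's code.

```cpp
// Sketch of a millisecond timestamp helper (pre-C++11 friendly).
// nowMs() is a hypothetical name used for illustration only.
#ifdef _WIN32
#include <windows.h>
#else
#include <time.h>   // assumption: POSIX fallback so the sketch builds off Windows
#endif

// Returns a timestamp in milliseconds; only differences between
// two calls are meaningful, not the absolute value.
double nowMs() {
#ifdef _WIN32
    LARGE_INTEGER freq;
    LARGE_INTEGER count;
    QueryPerformanceFrequency(&freq);   // counter ticks per second
    QueryPerformanceCounter(&count);    // current tick count
    return 1000.0 * static_cast<double>(count.QuadPart)
                  / static_cast<double>(freq.QuadPart);
#else
    timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1.0e6;
#endif
}
```

Usage would look something like `double t0 = nowMs(); mutex.lock(); double waited = nowMs() - t0;`, recording `waited` per thread rather than printing it while the lock is held.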

AFAIK the job never makes any sync calls, but I can't explain the slowdown unless it does.

I have no idea why copying from a stack and calling pop takes so long. Another very confusing thing is why a call to leave() takes so long.

Can anyone speculate on why it's running so slowly?

I wouldn't have thought the difference in processor would give a 100x performance difference, but could it be at all related to the dual CPUs? (Having to sync between separate CPUs rather than between internal cores.)

By the way, I'm aware of std::thread, but I want my library code to work pre-C++11.

Edit

    // in the while (hasjobs) loop...
    event qwe1 = {"lock", timeGetTime(), id};
    events.push_back(qwe1);

    scene->jobmutex.lock();

    event qwe2 = {"getjob", timeGetTime(), id};
    events.push_back(qwe2);

    hasjobs = !scene->jobs.empty();
    if (hasjobs) {
        job = scene->jobs.front();
        scene->jobs.pop();
    }

    event qwe3 = {"gotjob", timeGetTime(), id};
    events.push_back(qwe3);

    scene->jobmutex.unlock();

    event qwe4 = {"unlock", timeGetTime(), id};
    events.push_back(qwe4);

    if (hasjobs)
        scene->performjob(job);

And the mutex class, with all the Linux #ifdef stuff removed...

    CRITICAL_SECTION mutex;

    ...

    Mutex::Mutex() {
        InitializeCriticalSection(&mutex);
    }
    Mutex::~Mutex() {
        DeleteCriticalSection(&mutex);
    }
    void Mutex::lock() {
        EnterCriticalSection(&mutex);
    }
    void Mutex::unlock() {
        LeaveCriticalSection(&mutex);
    }

Windows' CRITICAL_SECTION spins in a tight loop when you first enter it. It does not suspend the thread that called EnterCriticalSection unless a substantial period has elapsed in the spin loop. So having 32 threads contending for the same critical section will burn and waste a lot of CPU cycles. Try a mutex instead (see CreateMutex).
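The CreateMutex suggestion could be sketched roughly as follows, keeping the same lock()/unlock() interface as the class in the question so it can be swapped in for comparison. The class name KernelMutex is hypothetical, and the pthread branch is only an assumption standing in for the "Linux #ifdef stuff" so the snippet builds off Windows. Whether this is actually faster for a given workload is something to measure, not assume.

```cpp
// Sketch: a lock()/unlock() wrapper backed by a Win32 mutex object
// (CreateMutex) instead of a CRITICAL_SECTION. Pre-C++11 friendly.
// KernelMutex is a hypothetical name, not from the original code.
#include <cassert>
#ifdef _WIN32
#include <windows.h>
#else
#include <pthread.h>   // assumption: fallback so the sketch builds off Windows
#endif

class KernelMutex {
public:
    KernelMutex() {
#ifdef _WIN32
        // Default security, not initially owned, unnamed.
        handle = CreateMutex(NULL, FALSE, NULL);
        assert(handle != NULL);
#else
        int rc = pthread_mutex_init(&handle, NULL);
        assert(rc == 0);
        (void)rc;
#endif
    }

    ~KernelMutex() {
#ifdef _WIN32
        CloseHandle(handle);
#else
        pthread_mutex_destroy(&handle);
#endif
    }

    // Blocks until the calling thread owns the mutex.
    void lock() {
#ifdef _WIN32
        WaitForSingleObject(handle, INFINITE);
#else
        pthread_mutex_lock(&handle);
#endif
    }

    // Releases ownership; must be called by the owning thread.
    void unlock() {
#ifdef _WIN32
        ReleaseMutex(handle);
#else
        pthread_mutex_unlock(&handle);
#endif
    }

private:
    // Non-copyable (pre-C++11 idiom: declared, never defined).
    KernelMutex(const KernelMutex&);
    KernelMutex& operator=(const KernelMutex&);

#ifdef _WIN32
    HANDLE handle;
#else
    pthread_mutex_t handle;
#endif
};
```

In the loop from the question, this would replace the type of scene->jobmutex; the calls to lock() and unlock() stay as they are.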

