Imagine I’ve ordered 8 items from an online store. In the warehouse those items are stored in different sections that might be far away from each other. If we have one person working at the warehouse he has to go through my list one by one and collect each time – walking back and forth across the warehouse as he does it. Finally he loads it all up in the truck.
The alternative is to hire 8 people – they split up the warehouse into 8 sections and each of them is responsible for bringing things from that section to a sorting bay where they are loaded into trucks.
That’s called a parallel problem. We have a set of tasks that could be done at the same time so we split the work into parallel processes.
On the other hand, when the driver does his delivery I need to be home at the same time in order to accept it. This is called a concurrent problem. If I’m not home he could just sit there and wait for me but then he would get about 1 delivery done per day (averaging out for all the times people just never came home on a given day).
In our concurrent problem the driver then leaves a note and tells me he will drop my packages off at the distribution centre at 6pm. If I get home early I have to wait till 6 before I can attempt to claim my packages.
Suppose the delivery driver left my packages on my front step – and then someone stole them. In programming we don’t usually see the process behind some outcome – we only see the end result. We could imagine taking a meta view of our universe and hitting “play”. When the universes finishes playing we can see me wearing a new shirt that I just ordered – thinking fantastic my universe executed just as I expected.
Diligent as we are, we decide to run the universe just one more time to make sure it all works properly. This time we look at the end result and I’m wearing some other shirt. But how is that possible? I got the delivery, I must have the shirt, why am I not wearing it? I look at the log from the delivery driver and it even says he delivered the packages – one of which contained the new shirt. So why is it not being worn?
So we start to debug our universe – we arrange for everything to pause when the package gets delivered. We see that it arrives at 12pm – and there’s no point waiting 6 hours watching the package at the door, so we hit play again and see that I am wearing the shirt this time. In fact only 1 in 10 times I’m not wearing the shirt.
We put up a check when I get to the front door to see if I take the packages inside and open them. Every time I end up not wearing the shirt there are no packages at the door even though we can see that they were delivered.
This painstaking debugging process is what it is like building “concurrent” programs. Anything that has multiple parties interacting across an internet connection needs to run concurrently. If a program is large enough different parts might need to run on different servers communicating concurrently.
We can pre-empt problems by putting in “locks”. These locks are like leaving a card if I’m not there and collecting from the distribution centre later. But in 9/10 cases where the package doesn’t get stolen that is so much slower. In order to fix our concurrent problem we have to cause delays and slowdowns.
Things don’t get much easier with our parallel problems. If one person in the warehouse takes 8 hours to do the work, we should expect 8 people to do the work in 1 hour. Except that one person might spend all day doing nothing while another person is in a really busy part of the warehouse and has to work for 3 hours. So now we are paying for 8 people for 3 hours – at a cost of 24 man hours – instead of one person for 8 hours. Sure it’s done quicker but at 3x the cost.
But we also have to hire a manager to supervise the team and make sure the work gets delegated appropriately – which is a 9th more expensive resource. And what happens if one of the warehouse team calls in sick? Do we wait a day for them to come back and fulfill the previous day’s orders? Or do we have a protocol in place for re-structuring the work for just that one day?
What if, for each delivery, we just split up the items and gave one to each person on the team? When five items are really close to each other but the last 3 are at other ends of the warehouse we are inefficient.
Now you need some sort of procedure that works out the best approach for splitting the work. And it also needs to account for the possibility that someone might be not be available, that you need to hire a manager to delegate the work and then there’s the extra cost in hiring more people.
Real life scheduling problems are far more nuanced because there is a human element. In computer programs the biggest factor is speed. These decision happen over nanoseconds or milliseconds. If an algorithm is inefficient it can go from executing in less than a second to executing in several seconds. Imagine the compounded effects – your parallel process is inefficient which causes issues in some concurrent process.
In a world where we rely in computer programs to execute efficiently these sorts of issues cause more than just delays. Concurrent problems can cause unexpected behaviours. As we saw these unexpected behaviours are harder to diagnose than simple errors generated by standard sequential programs.
As I’m writing this I have:
- A Haskell program executing in the background printing items on a console
- Google docs periodically “auto” saving what I’ve written
- I have another window open with Youtube playing music
- In another tab I have Facebook open which might randomly receive a message from someone
- And through all of this my computer is running an operating system with who knows how many background processes allowing me to type, see, and hear everything
Finally you might have noticed something convenient in my analogy – there are 8 hours in a day, I ordered 8 items, the warehouse had 8 sections and we hire 8 warehouse staff. This would be incredibly handy for a computer as everything works in powers of two. Real world problems are rarely so simple. For more, read my article about computational complexity and P vs NP.