‘Stupid’ Question 23: What is a thread? And what is multithreading?
[To celebrate my first year of programming I will ask a ‘stupid’ question daily on my blog for a year, to make sure I learn at least 365 new things during my second year as a developer]
What is a thread? And what is multithreading?
A simple question today, to make up for the heavier one yesterday :)
What is a thread?
A lightweight process that performs a sequence of steps (each step executes a line of code). Because the steps run one after another, the total running time is the sum of the time each step takes.
What is multithreading?
When you execute multiple threads at the same time.
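To make the two definitions a bit more concrete, here is a minimal sketch in Python. The names `count_steps` and `run_threads` are made up for this illustration and are not from any library. Each thread performs its own sequence of steps, and starting more than one of them is multithreading.

```python
import threading

def count_steps(name, steps, log):
    # One thread of execution: a sequence of steps performed in order.
    for step in range(1, steps + 1):
        log.append((name, step))

def run_threads(names, steps=3):
    # The log list is shared, because all threads live in the same process.
    log = []
    threads = [
        threading.Thread(target=count_steps, args=(name, steps, log))
        for name in names
    ]
    for t in threads:
        t.start()  # multithreading: several threads are now in flight
    for t in threads:
        t.join()   # wait until every thread has finished its steps
    return log

if __name__ == "__main__":
    for name, step in run_threads(["A", "B"]):
        print(name, "step", step)
```

Within one thread the steps always appear in order, but how the steps of A and B interleave is up to the scheduler and can differ from run to run.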
I think this simple definition belies the massive complexity multithreading and asynchrony can add to almost any programming solution beyond the obvious basics. E.g.: "Love is a feeling of emotional attraction between two humans" - seems simple enough, right? The problem comes when trying to implement multithreading (and love!) in real scenarios, and I think that warrants its own discussion... or maybe its own question? I'd love to see an extended discussion on this.
Belies: "To give a false representation to; misrepresent". I know threading is complex, and that is why I wanted to give a simple get-started definition, hoping devs will join in on a discussion. You might have noticed, and joined in yourself on, the discussions on the previous questions, and therefore hopefully see that I don't try to oversimplify or imply that I am an expert. What I am thinking is that I am allowed to ask, and try to answer, even if I am not an expert and cannot provide a big discussion on the subject. Or is it so that only the elite are allowed? I disagree with your statement that I give a false representation, or a misleading one. And please notice that the question does not include the problems that can come with multithreading, such as race conditions and deadlocks. Threading is a scary subject for many devs, and by giving information a spoonful at a time I am hoping to feed complex information to others and myself at a pace where one never gets 'full'. I would also love an extended discussion, and many can be found on, for example, MSDN and Stack Overflow. I have read them, and I am reading them. This is just my way of serving information, and I know that some will not like it, but those who do like it matter more to me. They are often newbies, n00bs or whatever we would like to call them: devs like me trying to fit in.
A great primer for understanding threading is the article Process and thread fundamentals. I've included an excerpt of the full article below.
Single threaded
If you've ever lived on your own, then you know what this is like — you know that you can do anything you want in the house at any time, because there's nobody else in the house. If you want to turn on the stereo, use the washroom, have dinner — whatever — you just go ahead and do it.
Multi threaded
Things change dramatically when you add another person into the house. Let's say you get married, so now you have a spouse living there too. You can't just march into the washroom at any given point; you need to check first to make sure your spouse isn't in there! If you have two responsible adults living in a house, generally you can be reasonably lax about “security” — you know that the other adult will respect your space, won't try to set the kitchen on fire (deliberately!), and so on. Now, throw a few kids into the mix and suddenly things get a lot more interesting.
You are, of course, correct! And I apologize if my post sounded negative or elitist in any way. I'm positive you aren't trying to misrepresent anything to anyone. I'm also not trying to discredit you or this (great) Q&A series in any way. :) Multithreading is one of the things that I feel least comfortable doing. For that very reason I simply wanted to ignite a discussion on the issues with it. Perhaps my wording was too callous though - I'm sorry if that's the case.
Anders: I just don't want to scare the living daylights out of new programmers. Some things become second nature as we get better, and we forget how scary it was/is. Threading is something many devs avoid, which is a shame because it is an important and very interesting subject for both desktop and web development. Fear is not a good learning tool; it makes you suppress the information given at that time. But good feelings, like 'hey, this ain't too bad. I understand this, this is easy!' will stick. And little by little we will add complexity without the student even noticing. People ask me how I can run a marathon. I say one step at a time. That one step is so easy. And so is the next one. After 42 km it was an easy run. Whenever it got hard, I just had to take one more step. So the questions are just that, steps. One question at a time, one step at a time, we all become marathon developers. I am not in the least comfortable with threading. I can make things happen, but I am never 100% sure what is really happening in the background. But understanding the very basics, what it is, I reckon that will on its own spark new questions, and a dev can see what the potential problems might be. And I think that even if the answers are very short, they do answer the what question. Now, why and how - that is a big and hard discussion, one that I am not comfortable leading at this moment.
I believe the name "thread" comes from a shortening of "thread of execution", which itself is a personification from "Thread of Life"
Locks aren't always necessary, and they aren't the only answer. There's a great post about lock-free algorithms. There are also things like 0mq for communicating between threads.
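As a rough illustration of the message-passing idea, here is a sketch using Python's standard `queue.Queue` rather than 0mq itself. The function names `producer`, `consumer` and `run_pipeline` are hypothetical. Note that `queue.Queue` does use a lock internally, so this is not a lock-free algorithm; the point is only that the two threads never touch each other's data directly and the calling code contains no explicit locks.

```python
import queue
import threading

def producer(q, items):
    # Send each item as a message, then a sentinel to signal "no more work".
    for item in items:
        q.put(item)
    q.put(None)

def consumer(q, results):
    # Receive messages until the sentinel arrives.
    while True:
        item = q.get()
        if item is None:
            break
        results.append(item * 2)

def run_pipeline(items):
    q = queue.Queue()
    results = []  # only the consumer thread writes to this list
    sender = threading.Thread(target=producer, args=(q, items))
    receiver = threading.Thread(target=consumer, args=(q, results))
    sender.start()
    receiver.start()
    sender.join()
    receiver.join()
    return results

if __name__ == "__main__":
    print(run_pipeline([1, 2, 3]))
```

With a single consumer the results come out in the order the messages were sent, because the queue is first-in, first-out.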
As an answer to the question I would like to provide a little history of multithreading, to perhaps give a broader perspective of what threads are. In the early days of computing some of the more powerful operating systems provided time-sharing, where multiple processes and even users could share the same computer. Often computers would have a single CPU, and the simplest form of time-sharing was cooperative multitasking, where each process would voluntarily yield execution time to the remaining processes. Earlier versions of Windows and Mac OS used cooperative multitasking. One process failing to cooperate on these operating systems could halt the entire system, resulting in a frustrated user.
However, more advanced hardware allowed operating systems to implement preemptive multitasking, where the operating system, with help from the hardware, could quickly switch between executing multiple processes. Small time slices allocated to each process allowed all processes to appear responsive and to appear to run in parallel. Each process would also be protected from the other processes, providing a more stable foundation for running in parallel. Unix, which was created in 1969, used preemptive multitasking as the basis of its architecture.
Multitasking can be a quite efficient way of utilizing hardware resources. If a serial task can be split into several parallel tasks, the total execution time will often be shorter even if these tasks only execute on a single CPU. This is because most tasks involve interacting with external devices like disks or the network, and when one task is waiting for a response from the external device another task can execute on the CPU, increasing the utilization of the CPU and shortening the total execution time. On Unix and similar operating systems you can achieve parallelism by creating new processes. Complex batch jobs run as many independent processes that work together to produce a final result.
The drawback of this approach is that it takes a non-trivial amount of time to create a new process, and because processes are protected from each other they need some way of exchanging data. A file can be used, or something more advanced, but that also puts a burden on the system. Multitasking using processes is limited by how fast you can create processes and exchange data between them. To improve on this, threads were invented. Threads are like processes, except they are contained within a single process and share the resources (in particular memory) of that process. Initially only processes provided parallelism, and in that context I believe it makes more sense to describe a thread as a "lightweight process". Threads are a natural evolution of using processes to achieve parallelism.
But why is parallelism on a computer interesting? On a multi-user computer it is a matter of sharing the computer, but even on a single-user computer you can get better utilization of the hardware (in particular the CPU) if processes and threads run in parallel. Also, modern hardware (even the phones we keep in our pockets) has CPUs with multiple cores, allowing processes and threads to execute truly in parallel.
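The point about waiting on external devices can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `time.sleep` stands in for a slow disk or network call, and the names `fake_io`, `run_serial` and `run_threaded` are invented for the example. The threads also write their results into one list, which works because they share the memory of the process.

```python
import threading
import time

def fake_io(delay, results, index):
    time.sleep(delay)       # stands in for waiting on a disk or the network
    results[index] = index  # every thread writes into the same shared list

def run_serial(n, delay):
    # Perform the n waits one after another.
    results = [None] * n
    start = time.perf_counter()
    for i in range(n):
        fake_io(delay, results, i)
    return results, time.perf_counter() - start

def run_threaded(n, delay):
    # Perform the n waits on n threads, so the waiting overlaps.
    results = [None] * n
    start = time.perf_counter()
    threads = [
        threading.Thread(target=fake_io, args=(delay, results, i))
        for i in range(n)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start

if __name__ == "__main__":
    _, serial_time = run_serial(4, 0.2)
    _, threaded_time = run_threaded(4, 0.2)
    print(f"serial: {serial_time:.2f}s, threaded: {threaded_time:.2f}s")
```

The serial version waits for each task in turn, while the threaded version lets the waits overlap, so it should finish in roughly the time of one wait. The exact timings depend on the machine, and the gain applies to tasks that mostly wait, not to tasks that keep the CPU busy.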
Great starting explanation. Thanks! I'm writing phone apps, and I'm using threading, but in a 'copy-and-paste-and-don't fully-understand-why-it-works' way. I'd love some elaboration on things like async for file transfers and what some best practices are for threading. I'm always looking for ways to improve my apps by 'hiding' processing from the user and keeping the UI responsive.
So Thank you for adding to our knowledge. And Your hair is beautiful...I Like It! :)
Last modified on 2012-08-14