pminten (16)
#1
In Chapter 5, sections 5.1 and 5.2 you talk about concurrency and parallelism.

It's certainly not a bad text, but I have the feeling the distinction could be made a little clearer, especially since this tends to be one of the things that confuse newcomers to concurrent systems.

Let me just write up how I understand the distinction, maybe that will help.

Concurrency is about having multiple threads of execution. It's about doing things independently of one another. While the degree of independence differs between concurrency mechanisms (coroutines are very tied together, processes very independent, processes on different machines extremely independent), concurrency is always about independence of execution.

Concurrency is always visible to the programmer, and it enables things that were previously hard or impossible to do. For example, concurrency lets you keep a GUI responsive while a long-running background task executes, or while you do a blocking read from the filesystem or the network.
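To make that concrete, here is a minimal Elixir sketch of offloading a blocking operation to a separate process so the caller stays responsive. The `slow_read` function is a hypothetical stand-in for any blocking read:

```elixir
# Hypothetical stand-in for a blocking filesystem or network read.
slow_read = fn ->
  Process.sleep(2_000)
  "file contents"
end

# Run the blocking work in a separate process...
task = Task.async(slow_read)

# ...so the current process stays free to do other things meanwhile.
IO.puts("still responsive")

# Collect the result once we actually need it.
IO.puts(Task.await(task, 5_000))
```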

Parallelism is about doing things more efficiently by actually running computations simultaneously. Usually efficiency means getting a job done faster, but it could also mean using less power (in certain situations 4 small cores can be more power-efficient than 1 big core while still being as fast).

Parallelism, in principle, is invisible to the programmer. A sufficiently advanced compiler could automatically produce instructions to run parts of a program in parallel. In practice this is very hard. CPUs and FPUs do internally run as much as possible in parallel, but above the level of a single instruction or a handful of instructions this is difficult. Some languages like Haskell have parallelism libraries that work fairly transparently. Though the programmer still has to write down where parallelism should (or could) take place, the denotational semantics (what the program does, as opposed to how fast it runs) don't change. For example, you could replace `map myFunc list` with `parMap myFunc list` and always get the same result in both cases.
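The same idea carries over to Elixir: in the sketch below, replacing a sequential `Enum.map/2` with `Task.async_stream/2` runs the function on each element in a separate process but still yields results in input order, so (assuming `my_func` is pure; it's a hypothetical placeholder here) the observable result doesn't change:

```elixir
my_func = fn x -> x * x end  # hypothetical pure function
list = [1, 2, 3, 4]

# Sequential version.
sequential = Enum.map(list, my_func)

# Parallel version: Task.async_stream/2 runs my_func on each element
# in its own process, but emits results in input order by default.
parallel =
  list
  |> Task.async_stream(my_func)
  |> Enum.map(fn {:ok, value} -> value end)

true = (sequential == parallel)
```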

For much lower-level languages, processors these days offer partial solutions for expressing parallelism directly. SIMD (single instruction, multiple data) instructions let you tell a processor: add 4 to each of these 8 bytes. Processors can use such instructions to perform certain operations much faster than they otherwise could, by employing parallelism. Using this kind of thing effectively is a bit tricky, though.

The confusing bit about concurrency and parallelism is that most languages implement parallelism by means of concurrency. This unfortunately makes programmers think that parallelism implies non-determinism. Parallelism is merely about making things more efficient. Concurrency is about changing semantics, about changing how code works.

You write "It's important to realize that concurrency doesn't necessarily imply parallelism". The reverse is also true: parallelism doesn't imply concurrency. In BEAM languages it does, but understanding the difference makes it a lot easier to reason about these things.

For example, if you want stuff to run faster, you typically want the code to behave as much as possible like a sequential algorithm, just faster. So you try to keep communication between processes to a minimum. If you want to run things independently, however, you typically have more complex communication patterns. Knowing what you're trying to achieve is very useful for figuring out the structure of your application.

Ahem, long post; hope it is in some way useful. :)
sjuric (86)
#2
Re: On Concurrency and Parallelism
Hi Peter,

Thank you for the useful feedback, as usual.

I agree with most of your points. I actually gave a 20-minute talk recently (video not yet available) where I argued that concurrency to me means running separate things separately.

I'll consider whether I can make the distinction clearer, but I also want to avoid entering into a long discussion, so I don't mind simplifying the topic for the purposes of the book, in particular with respect to the BEAM VM.

Best,
Saša
pminten (16)
#3
Re: On Concurrency and Parallelism
Yeah, it's easy to write a full page on a topic, hard to condense it into something that fits.

Haven't read further yet, actually, so I don't know if this is already in, but it may be useful to point out that you can get parallelism without introducing non-determinism at the same time, for example by using pid matching to receive the results in a specific order (see the sketch below).
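A minimal Elixir sketch of that idea, with a hypothetical `expensive` function: one worker is spawned per element, and the replies are received in spawn order by matching on each worker's pid, so the combined result is deterministic even though the workers run in parallel and finish in an arbitrary order:

```elixir
expensive = fn x ->
  Process.sleep(Enum.random(1..100))  # simulate variable work time
  x * x
end

caller = self()

# Spawn one worker per element; each sends {pid, result} back.
pids =
  for x <- [1, 2, 3, 4] do
    spawn(fn -> send(caller, {self(), expensive.(x)}) end)
  end

# Receive the replies in spawn order by matching on each pid.
# Selective receive keeps out-of-order replies in the mailbox
# until their turn comes, so the result order is fixed.
results =
  for pid <- pids do
    receive do
      {^pid, result} -> result
    end
  end

IO.inspect(results)  # always [1, 4, 9, 16]
```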
sjuric (86)
#4
Re: On Concurrency and Parallelism
No, I didn't discuss explicit ordering of multiple incoming messages via a pattern-matched receive, if that's what you mean. The main reason is that it's an approach I never needed in production, and I don't think it's a frequent pattern.

So instead, I wanted to focus on the typical message-passing pattern: a send, followed by an optional response which we await immediately (essentially the gen_server pattern).
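For comparison, a minimal sketch of that pattern in raw Elixir, using a hypothetical `EchoServer` loop; `GenServer.call/2` wraps essentially this send-then-await-immediately exchange (plus refs, monitors, and timeouts):

```elixir
defmodule EchoServer do
  # Hypothetical minimal server: receive a request, reply, loop.
  def start, do: spawn(fn -> loop() end)

  defp loop do
    receive do
      {:call, from, request} ->
        send(from, {:reply, {:echo, request}})
        loop()
    end
  end
end

pid = EchoServer.start()

# Send the request and immediately await the response.
send(pid, {:call, self(), :hello})

receive do
  {:reply, response} -> IO.inspect(response)  # {:echo, :hello}
end
```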

That being said, this is also an interesting comment, and I'll reconsider the possibility of presenting this technique.
pminten (16)
#5
Re: On Concurrency and Parallelism
Yeah, I guess it's true that this isn't very useful in production. There you typically either assign unique keys, aren't bothered by the order, or combine values using a commutative operation (so order doesn't matter).