jones065 (3)
#1
This goes with all due respect since I am finding this book extremely useful, but I keep re-reading MYTH #3 in Chapter 2 and it's gobbledygook. Jon keeps using the word reference as a noun, an adjective, an adverb and a verb. It would be much easier on the reader to use more exact terminology than "reference" for everything that is meant. I'm not convinced he is making any worthwhile argument when he uses the code snippet:


void AppendHello(StringBuilder builder)
{
    builder.Append("hello");
}


and then says:

"If I were to change the value of the builder variable within the method - for example with the statement [i]builder = null;[/i] - that change wouldn't be seen by the caller, contrary to the myth."

Playing devil's advocate: if you declare two local StringBuilder variables, assign the first to the second, and then set the second to null, you get the same result. The first variable still refers to a live object, whereas the second is null:

StringBuilder sb1 = new StringBuilder();
StringBuilder sb2 = sb1;
sb2 = null; // sb2 is now null because it doesn't point to sb1 anymore, but sb1 is still referencing a valid object.

So, this change isn't being seen locally either and kind of contradicts Jon's argument since it seems to say this is something that happens only if you use a reference type passed as an argument to a method.
185332 (1)
#2

StringBuilder sb1 = new StringBuilder();
StringBuilder sb2 = sb1;
sb2 = null;  // sb2 is now null because it doesn't point to sb1 anymore, but sb1 is still referencing a valid object.

So, this change isn't being seen locally either and kind of contradicts Jon's argument since it seems to say this is something that happens only if you use a reference type passed as an argument to a method.


Perhaps it helps to think of variables as memory references, the way a domain name is a way of referring to an IP address. When you declare a variable for an object, whether you instantiate anything or not, you are (in this simplified picture) allocating 64 bits on the stack, no matter the type of the object*. Those 64 bits hold the memory address of the actual object. When you "new" an object, you're allocating an "unknown"** amount of space on the heap to store the actual result of the constructor (the object). This amount varies with the type of object being created.

So, StringBuilder sb1 tells the computer "reserve enough space to store a memory address on the stack." For simplicity, let's say the 64 bits starting at 1000 are reserved, and "sb1" is therefore a convenient label for stack location 1000-1063.
Now the rest of that command:
 StringBuilder sb1 = new StringBuilder(); 
means "call the StringBuilder constructor, store the result in memory, then place the 64-bit address of that memory location in stack bits 1000-1063." The "value" of sb1 isn't the StringBuilder; it's the memory location of the StringBuilder. Let's say a StringBuilder is 100 bytes, and the program allocates 100 bytes starting at address 0AAA0AAA0AAA0AAA. So the -value- at stack location 1000 is that 64-bit number.
Then the next statement, StringBuilder sb2 = sb1, runs. The declaration again means "reserve enough space to store a memory address", and that space is put on the stack at address 1064, so "sb2" is now the label for stack bits 1064-1127. This time we don't have to call any constructor (there's no "new"); it just copies the value in sb1 into sb2. Now those 128 bits of the stack are two copies of the StringBuilder's address: 0AAA0AAA0AAA0AAA | 0AAA0AAA0AAA0AAA.
Now it should be pretty clear what happens in the third step: the memory location labeled sb2 is "nulled". For simplicity, let's change it to 0s***. Now those 128 bits of the stack are 0AAA0AAA0AAA0AAA | 0000000000000000.
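
If it helps, here is a minimal sketch of those three steps as a runnable program. The class name LocalCopyDemo and the "abc"/"def" contents are just placeholders I made up for illustration; the point is only that sb1 and sb2 start out as two variables holding the same address, and nulling one leaves the other alone.

```csharp
using System;
using System.Text;

public static class LocalCopyDemo
{
    public static void Main()
    {
        StringBuilder sb1 = new StringBuilder("abc");
        StringBuilder sb2 = sb1;   // copies the reference (the "address"), not the object

        // One object on the heap, two variables referring to it.
        Console.WriteLine(object.ReferenceEquals(sb1, sb2)); // True

        sb2.Append("def");         // changes the single shared object
        Console.WriteLine(sb1.ToString());                   // abcdef

        sb2 = null;                // overwrites only sb2's own slot
        Console.WriteLine(sb1 != null);                      // True
        Console.WriteLine(sb1.ToString());                   // abcdef
    }
}
```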

I believe Jon's point here is that if arguments were actually passed *by reference*, the method would receive the reference itself, i.e. a reference to stack location 1000. Instead, the *value* held in that location, "0AAA0AAA0AAA0AAA", is passed to the method. A new variable (for our purposes, stack location 1128 to 1191) is allocated, the value is copied into it, and that location is labeled paramSb3. There's still one object on the heap, but two copies of the value which represents the memory location of that object. That's why paramSb3.Append("I did this in the method") actually changes the content of the StringBuilder, but paramSb3 = null or paramSb3 = new StringBuilder() doesn't. In both of those cases, stack location 1128-1191 is being changed, in one case to "all zeroes" and in the other to the address of a new StringBuilder, say at memory location A000A000A000A000. At the end of the method, locations 1128-1191 go out of scope, and any changes to them are lost. At some point down the line (assuming the method didn't return the address of the new StringBuilder) the garbage collector will reclaim the no-longer-referenced StringBuilder at A000A000A000A000.
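
A rough sketch of that paragraph in code, in case it makes the difference easier to see. The class and method names (ParameterDemo, AppendInMethod, NullInMethod, ReplaceInMethod) are hypothetical ones I picked for this post; only paramSb3 and the appended text come from the description above.

```csharp
using System;
using System.Text;

public static class ParameterDemo
{
    // Works through the copied reference: changes the one shared object.
    public static void AppendInMethod(StringBuilder paramSb3)
    {
        paramSb3.Append("I did this in the method");
    }

    // Overwrites only the method's own copy of the reference.
    public static void NullInMethod(StringBuilder paramSb3)
    {
        paramSb3 = null;
    }

    // Same again: the new object is only ever known to the local copy.
    public static void ReplaceInMethod(StringBuilder paramSb3)
    {
        paramSb3 = new StringBuilder("brand new");
    }

    public static void Main()
    {
        StringBuilder sb1 = new StringBuilder();

        AppendInMethod(sb1);
        Console.WriteLine(sb1.ToString()); // I did this in the method

        NullInMethod(sb1);
        Console.WriteLine(sb1 != null);    // True

        ReplaceInMethod(sb1);
        Console.WriteLine(sb1.ToString()); // I did this in the method
    }
}
```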

When you pass a value type "by reference", you pass the stack address of the variable rather than its contents. For example:
 long anInteger = 42; 
reserves a memory word location (say 2048-2111) and stores a 64 bit representation of the number 42 (000000000000002A if my math is working tonight) in that location. If you pass this variable to a function "by value" (or "normally," without the ref keyword) the method gets 0x2A as an input, which it stores in its own memory location (2112-2175). If you pass it "by ref" it is granted access to memory location 2048 (which happens to hold the value 0x2A). Now changes to this location made in the method would be visible outside the method, as it's accessing the exact same bits, not just a copy of the value.
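
Here is a small sketch of the value-type case, assuming nothing beyond the long anInteger = 42 example above. ValueTypeDemo, SetByValue and SetByRef are names I invented for the example, and 99 is an arbitrary replacement value.

```csharp
using System;

public static class ValueTypeDemo
{
    // Receives a copy of the 64-bit value; the assignment touches only that copy.
    public static void SetByValue(long number)
    {
        number = 99;
    }

    // Receives access to the caller's storage location; the assignment is visible outside.
    public static void SetByRef(ref long number)
    {
        number = 99;
    }

    public static void Main()
    {
        long anInteger = 42;

        SetByValue(anInteger);
        Console.WriteLine(anInteger); // 42

        SetByRef(ref anInteger);
        Console.WriteLine(anInteger); // 99
    }
}
```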

Skeet's point is that if objects were always passed "by ref" as is claimed, the reference to the original stack memory location (1000 or 2048) would be passed.

Put another way, Skeet's point is that the *same thing* happens for both objects and value types. If the ref keyword is not used, the *content* of the stack variable is passed and assigned to a *new* stack variable, which goes out of scope at the end of the method call. When the ref keyword is used, the stack memory location itself (not its contents) is passed to the method. In the case of objects, all of these values are themselves references to another memory location (all pointing to the same object), so changes to the object made inside the method are reflected outside the method, but changes to the object variable (setting it to null or to a new object) are not.
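
To round that off, a minimal sketch of what changes when ref is applied to a reference-type parameter. RefParameterDemo, NullByValue and NullByRef are hypothetical names of my own; the behaviour shown is just the standard ref semantics described above.

```csharp
using System;
using System.Text;

public static class RefParameterDemo
{
    // Without ref: the method nulls its own copy of the reference.
    public static void NullByValue(StringBuilder builder)
    {
        builder = null;
    }

    // With ref: the method nulls the caller's variable itself.
    public static void NullByRef(ref StringBuilder builder)
    {
        builder = null;
    }

    public static void Main()
    {
        StringBuilder sb1 = new StringBuilder("hello");

        NullByValue(sb1);
        Console.WriteLine(sb1 == null); // False

        NullByRef(ref sb1);
        Console.WriteLine(sb1 == null); // True
    }
}
```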

* Object references in a 64-bit process are always 64-bit memory addresses; I *think* value types are allowed to use more or less memory as needed.
** I would have said "variable" instead of "unknown", but that would have been confusing next to all the talk of variables. Obviously the amount isn't really unknown.
*** Not necessarily what C# actually does (how null is represented is an implementation detail), but the important thing is that 1064-1127 no longer holds the address of the original object, which is alive & well on the heap at 0AAA0AAA0AAA0AAA, and is referenced by "sb1" -- stack bits 1000-1063.