Stupid Question 118: The pointer series: What is a pointer?

[To celebrate my first year of programming I will ask a ‘stupid’ question daily on my blog for a year, to make sure I learn at least 365 new things during my second year as a developer]

What is a pointer?

So there has been a lot of talk about these ‘pointers’. My last few C and Objective C questions, and the discussions that followed, have gotten a few ‘you must know about pointers!’ reactions. And I always listen to the advisory board (the awesome devs I chat with on and off line).

Stupid Question 116: Do you need to know C to learn Objective C?
Stupid Question 115: What is the connection between C, C++, objective C and C#?

I know what a pointer is. I think I do. Let’s put my knowledge to the test because I really want to get this right. Here is my best attempt at explaining what a pointer is – rather simplified and I trust you guys and girls to correct me if I am wrong.

A pointer is a variable, and by variable we mean something that doesn’t have a fixed value- it is changeable. The value it holds is the memory address of another variable.

For laughs before you proceed go here

It points to an address – or?
Well, actually it can point to anything, and a reference is actually a pointer (but it points to a variable or object, as a sort of alternative identifier). Still, the term pointer is most commonly used for pointers that hold a memory address, rather than the high level abstraction we see in, for example, C# references.

There are different types of memory addresses, physical and logical, but since your app most likely uses a logical (think virtual) address let’s focus on that. When you spin up your application, a chunk of RAM (Random Access Memory) is set aside to hold the memory needed for your application. Each variable that you use will get a piece of memory allocated, and that bit of memory will have a memory address. An array of 3 ints will take up 4 blocks of memory (each block is a byte), the last bit indicating the end of the array. You define a pointer like so: int *pointer;
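To make the declaration above concrete, here is a minimal sketch in C of storing an address in a pointer and then reading and writing through it. The names read_through and write_through are made up for this illustration.

```c
/* Returns the value stored at the address held in `pointer`. */
int read_through(const int *pointer)
{
    return *pointer;   /* dereference: follow the address to the int */
}

/* Stores `value` at the address held in `pointer`. */
void write_through(int *pointer, int value)
{
    *pointer = value;  /* changes the variable that `pointer` points to */
}

/* Usage:
 *     int number = 42;
 *     int *pointer = &number;      // pointer now holds the address of number
 *     write_through(pointer, 7);   // number is now 7
 */
```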

A little spoiler here, but I will talk about pointers in C# tomorrow :D

Comments

Martin Blore

1/3/2013 5:11:34 AM

Hi Iris,

I've done all of the above mentioned languages, and what you have explained sounds perfectly correct to me. I was impressed by you mentioning that what a pointer points to is not always a physical memory address.

I am a little confused at the "the term pointer is most commonly used to refer to pointers that point to a memory address rather than the high level abstraction we see in for example C# references". A pointer still holds a memory address, what do you mean by a "high level abstraction"?

Thanks :).

Nic Ferrier

1/3/2013 5:16:52 AM

What else can a pointer point to? I've not programmed in Objective-C but as an old time hacker, my understanding of C is that pointers point to memory addresses.

Dima

1/3/2013 5:19:31 AM

Hello, Iris,

I think that your definition is good, but what I would like to mention is that there are constant pointers, so, probably, in general we can't say that the value of a pointer can be changed. For example, you can find some more information  and .

Thanks!

Iris Classon

1/3/2013 5:20:41 AM

Reply to: Martin Blore

A reference refers while a pointer points. So while a reference kind of does the same thing as a pointer, there is a higher level of abstraction there so we aren't referring to the allocated address directly, but indirectly allowing for the garbage collector to do its job. Do I get it? I hope I get it :P

Iris Classon

1/3/2013 5:24:02 AM

Reply to: Dima

I don't think I wrote that the value can be changed (not directly). I wrote that the definition of a variable is that it can be changed, and for a variable that would be the correct definition. So maybe we can't say that a pointer is a true variable. But what is it then, just a pointer?

The different types would absolutely be worthy of a blog post of their own. I find this stuff really interesting :D

Dima

1/3/2013 5:26:47 AM

Reply to: Dima

Sorry, markup went wrong. Here are the links:
first and second.

Sahil Malik

1/3/2013 5:29:12 AM

Your definition is correct. A pointer is a variable that points to a memory location.
We use pointers to point to either an object, or a function address. 

Pointers can be of different sizes depending upon the bitness of the operation. So an int* ptr is different from an NSString*, which is different from an LPWSTR*, etc.

Now here is something really really cool. COM is based on an interface called IUnknown, which has 3 methods: AddRef, Release, and QueryInterface. QueryInterface looks like this:

HRESULT QueryInterface(
  [in]   REFIID riid,
  [out]  void **ppvObject
);

So the out variable there is a void** .. which means .. it is a way for you to dereference a pointer to a pointer of void type, which means, it is a very neat way to hold a reference to a strongly typed object in a weak typed manner without any memory overhead. This is the genius of COM.

C# tries to abstract developers away from all this nastiness of course, but the price we pay for that is that C# or managed code in general is unsuitable for serious mobile applications where every bit and byte matters. Which is why ObjectiveC is awesome.

Sahil Malik

1/3/2013 5:33:29 AM

Okay more geeky stuff.

Strings have always been a pain in the donkey. Usually languages have represented strings as string*, i.e. a pointer to a location, and you start reading forward from there to find the string, and a string termination character tells you where the string ends.

When you work with C or C++, you have to worry about all that. ObjectiveC uses NSString* to eliminate the complexity while giving you the tightness of C. C# pre-allocates the string so you can treat strings as objects. The problem of course is an overhead when you pass strings between methods, which is why they say a string is a value type that thinks it's an object.

C# and .NET are aimed at one goal - developer productivity. The computer can go to hell. IMO we have ended the decade where we could ignore things like power consumption and CPU cycles. I wonder if C# 6 will give us more control :) .. time will tell. But I'm not betting on it.

Sean Murphy

1/3/2013 5:38:36 AM

And here I was about to go on about, "The cursor (or "pointer") provides feedback to the user. When used properly, choice of an effective and apt cursor choice, such as "pointer" or "move" serves to provide the user with the proper context for the activity that they're currently involved in..."

Well played, Iris! Well played!

-@toronto_designr

Dima

1/3/2013 5:42:52 AM

Reply to: Iris Classon

Yes, unfortunately, these problems with names and formal definitions sometimes happen. It's a variable, but it can be a "constant variable", which is itself rather contradictory to me :)

Martin Pilkington

1/3/2013 6:01:46 AM

I wouldn't actually use the term variable. A variable is a distinct concept to that of a pointer. A pointer is often stored in a variable but they are distinct things.

The simplest way to define a pointer is "a pointer points to a location in memory". That is simple enough. The real problem in explaining them is people stop there. They don't say "why do we have pointers?".

When you pass a value into a function in C, it copies the value into a new memory location (referred to as "passing by value"). This means that if you change an argument in a function it doesn't change the contents of the variable you passed in when calling it. This is fine for things like integers, which are only a few bytes, but things like strings, arrays, objects (if in Obj-C, C++ etc) or just a blob of binary data are often more than a few bytes, and can sometimes be several MB (or in some cases even GB). We don't want to copy that every time we pass it into a function.

A pointer is only a few bytes (4 or 8 bytes usually), so it doesn't take up much space. Copying a pointer isn't a big memory hog, so we can pass in a pointer to the data and access it without the need for copying. This is referred to as "passing by reference".
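A rough sketch of that difference in C, assuming a made-up struct blob of about 1 MB. Strictly speaking C still copies something on every call, but here it is only the pointer (a few bytes) and not the data it points to.

```c
/* A deliberately large struct: about 1 MB of payload (hypothetical type). */
struct blob {
    unsigned char data[1024 * 1024];
};

/* Only the address is copied into the function, not the 1 MB of data. */
unsigned char first_byte(const struct blob *b)
{
    return b->data[0];
}

/* Because the function has the address, it changes the caller's data. */
void set_first_byte(struct blob *b, unsigned char value)
{
    b->data[0] = value;
}
```

Calling set_first_byte(&big, 9) on a struct blob named big changes big itself, whereas a function taking struct blob by value would only change its own copy.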

Another benefit is to have multiple return values, something not natively supported in C. This is used a lot in the Cocoa APIs for returning errors. Say you have a method -doSomethingWithObject:error:. You create a variable to hold an NSError and pass a pointer to that variable (or more accurately to the memory location represented by that variable). If something goes wrong, the method creates an NSError object and tells the variable to point to that. It then returns NO or nil or some other result to indicate an error has occurred.
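The same idea can be sketched in plain C rather than Cocoa. The function divide below is hypothetical and only stands in for the -doSomethingWithObject:error: pattern: it reports success through its return value and hands back the result and an error message through pointers supplied by the caller.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical function: divides a by b.
 * On success, returns true and stores the quotient in *result.
 * On failure, returns false and, if `error` is not NULL,
 * points *error at a message describing what went wrong. */
bool divide(int a, int b, int *result, const char **error)
{
    if (b == 0) {
        if (error != NULL) {
            *error = "division by zero";
        }
        return false;
    }
    *result = a / b;
    return true;
}
```

The caller declares int quotient and const char *error = NULL, passes &quotient and &error, and checks the returned bool before looking at either.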

Finally, a pointer is useful for defining the start and/or current location when iterating over a block of consecutive memory addresses (for example, an array). There are other uses too, but these aren't encountered by most people and usually refer to low level code that needs to manipulate memory a lot, something most developers never need to do, which is why it's usually not possible in higher level languages.
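As an illustration of the iteration case, here is a small sketch that walks a pointer over an array of ints. The function name sum_ints is an assumption made for this example.

```c
#include <stddef.h>

/* Sums `count` ints starting at `start` by walking a pointer over them. */
int sum_ints(const int *start, size_t count)
{
    const int *end = start + count;   /* one past the last element */
    int total = 0;
    for (const int *current = start; current != end; ++current) {
        total += *current;            /* read the int at the current position */
    }
    return total;
}
```

Here start marks the beginning of the block and current marks the position reached so far. Incrementing current moves it forward by one int, not by one byte.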

Stephen Woolhead

1/3/2013 6:14:10 AM

Reply to: Nic Ferrier

A pointer always points to an address in the current logical address range. Things can be 'mapped' into this address range. Good examples are memory mapped files: if you have a 1 MB file on disk, it could be mapped into the memory address range 0x12300000 to 0x123FFFFF, and to write to the offset 0xABCD in that file you'd just set a pointer to 0x1230ABCD.
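The arithmetic in that example can be written out as a small sketch. Actually mapping a file needs an operating system call, so the second function assumes the caller already has a pointer to the start of the region (an ordinary array works as a stand-in). Both function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Numeric address of byte `offset` inside a region that starts at `base`. */
uintptr_t mapped_address(uintptr_t base, uintptr_t offset)
{
    return base + offset;
}

/* Pointer to byte `offset` inside a region whose first byte is at `base`. */
unsigned char *byte_at(unsigned char *base, size_t offset)
{
    return base + offset;
}
```

With a base of 0x12300000 and an offset of 0xABCD, mapped_address gives 0x1230ABCD, and writing through the pointer returned by byte_at changes that byte of the region.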

Pointers always* point to memory addresses. What's at that memory address can be anything from RAM to a file or the control registers for a piece of hardware like a serial port or graphics card.

*assuming you have remembered to set it!

Larry Smith

1/3/2013 11:17:22 AM

Try to solve the following problem without pointers. You can't.

Note: For simplicity's sake, we'll talk exclusively about ints. And we'll use C-like pseudocode.

Write a function called Add1. You can pass it any variable and when you return from the routine, the variable has been incremented.

int x = 5;
int y = 12;
Add1(x);
// x is now 6
Add1(y);
// y is now 13

Now some might say that what the Add1 routine is passed is the value of its parameter (either 5 or 12 in our sample above). But big deal. There's this int, somewhere in memory, that needs to be updated. And passing 5 or 12 doesn't give us the merest hint of where it is. 

What you need to pass isn't its value, but where it is. 

We call that a pointer.
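In real C, as opposed to the pseudocode above, a sketch of that routine might look like this. The caller has to pass the address explicitly with &, so the calls become add1(&x) and add1(&y).

```c
/* Increments the int stored at the address held in `where`. */
void add1(int *where)
{
    *where += 1;   /* update the caller's variable, not a copy of its value */
}

/* Usage:
 *     int x = 5;
 *     add1(&x);   // x is now 6
 */
```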

Amusing historical fact -- Early computer architectures didn't tend to have an "increment by one" opcode. So if you wrote "i = i + 1", what you could wind up with was an int in memory with the value 1, that could be added to "i". And to save storage (very scarce and expensive in those days), every time you referred to a constant (in this case, 1), the compiler might always use that same value.

FORTRAN compilers traditionally have passed parameters by reference, not value. They passed pointers to their subroutine parameters. 

And if you passed a constant (say, 1) to an Add1 function, some buggy early compilers would happily modify the constant in memory! So if you wrote (again, in C-like pseudocode)...

int x = 1;
int y = 12;
Add1(1);
y = y + 1;
// y is now 14(!), since the "constant" 1 was modified in memory by the Add1 routine!!!

The above is absolutely true.

Barber

1/5/2013 5:09:59 AM

Great post. However, there is one error. Arrays aren't usually null terminated; only char* strings are null terminated. That is the only case where 0 can always be an invalid value. An int array would otherwise be unable to hold the value zero, which makes no sense.

Usually you have to manually keep track of the length of the array or create a struct to hold the array and its size.
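A minimal sketch of the struct approach, with a made-up type name int_array. The length travels with the pointer, so no terminator value is needed and zeros can be stored like any other int.

```c
#include <stddef.h>

/* Hypothetical type pairing a pointer to the first element with the count. */
struct int_array {
    int *items;
    size_t length;
};

/* Counts how many elements are zero; zero is a perfectly valid element. */
size_t count_zeros(const struct int_array *array)
{
    size_t zeros = 0;
    for (size_t i = 0; i < array->length; ++i) {
        if (array->items[i] == 0) {
            ++zeros;
        }
    }
    return zeros;
}
```

For an array declared in the same scope, such as int raw[] = {3, 0, 7, 0}, the length can be computed as sizeof raw / sizeof raw[0].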

Also, someone made an incorrect statement that an int* differs from an NSString*. The only difference is the memory locations they point to!

Pointers can be of different size, which is determined by the hardware, i.e. 16-bit systems generally use a short int for pointers while a 64-bit system would use a long int. There is actually a type named size_t in C which corresponds to the hardware's addressing size.

As some people have mentioned, a pointer can point to either the memory location of a variable, the first index of an array or a function. The pointers themselves don't differ, although the compiler will warn or give errors if you try to assign a pointer the value of another pointer of incompatible type. You could do that by explicit typecasting, but you'd really have to know what you are doing.
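The three kinds of target mentioned above can be sketched in a few lines of C. The names twice and apply are assumptions made for this example.

```c
/* A plain function whose address can be stored in a function pointer. */
int twice(int n)
{
    return n * 2;
}

/* Calls the function `fn` points to, on the int that `target` points to. */
int apply(int (*fn)(int), const int *target)
{
    return fn(*target);
}

/* Usage:
 *     int value = 4;
 *     int list[] = {10, 20, 30};
 *     const int *to_variable = &value;   // points to a variable
 *     const int *to_first = list;        // points to the first array element
 *     int (*to_function)(int) = twice;   // points to a function
 */
```

The pointer types are part of the declarations, which is what lets the compiler complain if, say, to_function were assigned to to_variable without a cast.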

James Curran

1/8/2013 4:08:44 PM

The analogy I like using is that a pointer is like a business card (calling card); it's not the person, but it tells you where to find him.

Pointers and references are completely different things. They are sometimes mistaken for each other because references are generally implemented using pointers, but the primary rule of OO development is "Implementation Is Irrelevant". A reference is in effect a pointer that is always automatically de-referenced; you can only see the object being pointed to, and never the underlying pointer.

jalf

2/14/2013 4:05:36 PM

Reply to: Iris Classon

Pretty much. Conceptually you're spot on, but the garbage collector is sneakier than you give it credit for. In garbage collected languages, a reference typically *does* point directly to the memory address it refers to. The garbage collector just updates the reference on the fly when necessary. But yeah, conceptually, there is a level of abstraction in between, because the pointed-to object might be moved by the garbage collector, and the reference referring to it is *seemingly* unchanged.

Last modified on 2013-01-02