Message thread

TPU Chat

Chat on IRC
Internet Relay Chat?
Java Applet client
irc.freenode.net
/join #hprog
Who's on IRC?

2 dimensional Arrays in C++

Posted by RedX [send private reply] at April 22, 2001, 02:46:44 PM

How can I create a 2 dimensional array in C++, which is created on run-time?
I did some experimenting on this some time ago, but I can't find the source anymore. I remember that I could create, fill and read one, but I couldn't
delete it without crashing the program.
So until now I have been avoiding those by using array[x*y]
instead of array[x][y].

RedX

Posted by Psion [send private reply] at April 22, 2001, 05:56:37 PM

Using a one-dimensional array is the only way you can achieve this kind of functionality with arrays directly if you don't know either dimension at compile time. You can use a cop-out solution by allocating an array of pointers to arrays, but that is much slower.

Posted by RedX [send private reply] at April 23, 2001, 12:52:29 PM

That was the way I did it at first, but never got the memoryleak out.
Unfortunately I lost the source of it, maybe if I have some spare time I'll try it again. I hoped it would be possible to use such a array as an speed improvement, since there wouldn't be a mul. Guess I should go with bitshifts then.

RedX

Posted by Psion [send private reply] at April 23, 2001, 02:44:20 PM

There is a "hidden" multiplication for each dimension above 1 of any array you use. You save no time doing it yourself versus using built-in array functionality if your compiler optimizes reasonably.

Posted by RedX [send private reply] at April 24, 2001, 12:33:50 PM

In that case I'll use the bitshifts.
Thanks.

RedX

Posted by pramod [send private reply] at April 26, 2001, 01:18:51 PM

You could try something like this :

char** array;
array = new char*[nrows];
for(int i = 0; i < nrows; i++)
{
array[i] = new char[ncols];
}

Of course you'll have to take all that trouble again delete those arrays.

Posted by Psion [send private reply] at April 26, 2001, 03:53:07 PM

That is the method I already referenced, an array of pointers to arrays. Not only is it a pain to use and open to all sorts of delicious memory leak possibilities, but it also uses more space and probably takes longer to execute (depending on the speeds of the processor's different operations). It has also been linked to nefarious plots to subvert the minds of today's youth with this "rock music".

Posted by pramod [send private reply] at April 29, 2001, 04:51:29 AM

Actually, this is the 'official' method. From where i think, this is also the only one method possible. The borland c++ documentation gives this uses this method in the documentation for the new operator on how to create a mutli-dimensional array

Interestingly, Java is better than c++ in this respect. It allows you do something like :
int array[][] = new int[r][c];

Posted by Psion [send private reply] at April 29, 2001, 10:10:34 AM

The only reason it would be "official" would be because it allows a programmer to use a familiar interface through most of his program. It is a bad idea to use it, as there is _no_ other benefit from it, and it uses more time and space. Java is not better than C++ in this respect if the extra memory and time overhead from Java's dynamic type system is not worth the extra tricks you can use.

Posted by pramod [send private reply] at May 01, 2001, 07:29:30 PM

There's one thing that you can do to avoid memory leaks and the like. Write a template based routine to allocate and free 2 or 3 dimensional arrays automatically. I've written a 2 routines to alloc and free 2 dimensional arrays. Now you can leave all the allocating a deallocating stuff to the two functions. If you're really serious about performance, you could also make them inline. I think writing a similar 3-dimensional array routine is almost trivial.

template <class T>
T** multialloc(int r, int c)
{
T** x = new T*[r];
for(int i = 0; i < r; i++)
{
x[i] = new T[c];
}
return x;
}

template <class T>
void multifree(T **a, int r, int c) // actually c is not required
{
int i;

for(i = 0; i < r; i++)
{
delete[] a[i];
}
delete[] a;
}

You can call them with something like

int** x = multialloc<int>(2, 4);
multifree(x);

About this thing that there's no benefit from using it, I think just being able to write a[x][y] is a benefit in itself because the compiler produces inline code that is anyway faster than a[x * y]. I suppose a good JIT compiler would beat your program in performace if you do a lot of a[x * y]'s

Posted by Psion [send private reply] at May 01, 2001, 07:52:11 PM

First, the formula to find a 2-D index is something like x*COLS+y, not just x*y. :-) Secondly, the repeated dereferencing of pointers needed for each extra dimension will probably take longer than the usual sequence of arithmetic operations used for arrays whose dimensions are known. You also can't get around the fact that you are using more space and getting slower operations. :-)

Posted by pramod [send private reply] at May 03, 2001, 09:17:08 PM

On intel processors, there is no low-level instruction for dereferencing a pointer. You use something like this :

mov [di], ax

to copy the stuff in the ax register to the location *pointed to* by the di register.

the mov instrution takes 1 or some-times 3 processor cycles. two instructions like this for a 2-dimensional array take 2 or a maximum of 6 cycles.

the mul [for multiplication] instruction takes 13-42 processsor cycles. add the time for the addition, [we've only come to x*COLS part yet] this needs 1-3 clock cycles AND the time for the mov instruction [you finally need to do something with the array] another 1-3 cyles and i think is obvious what is slower.

i also think most other processors will have similar characteristics. so, i think you're using a damn slow method and you processor is so fast that you don't know it.

about the extra space part, i suppose a method that is faster a atleast a factor of 3 [is it more !?] is worth the extra space.

Posted by taubz [send private reply] at May 04, 2001, 11:25:56 AM

paramod, where do you think di comes from?

- taubz

Posted by pramod [send private reply] at May 08, 2001, 09:01:21 PM

Well, di is a register. Suppose you have a pointer say int* x, then if your compiler supports inline assembly you would write :

mov di, x
mov [di], 44

on 32 bit platforms you would write :

mov edi, x
mov [edi], 44

this would be equivalent to writing
*x= 44

All I wanted to say is that the dumb method of writing a[x*COLS + y] is just a lot slower.

Posted by Psion [send private reply] at May 08, 2001, 09:26:36 PM

I'm not really arguing against you at this point. However, you won't find local variables stored at constant locations like you have assumed with x.

Well, I'll argue some more! :-) Optimizing compilers will make arrays with COLS constant and a power of 2 much faster. :P

Posted by taubz [send private reply] at May 09, 2001, 11:32:55 AM

I'm arguing against you! :-)

You're saying that the assembly instructions mov di, x; mov [di], 44; substitute as a faster replacement for a[x*COLS+y]=44. But, you have to compute x*COLS+y whether you're using C or ASM. The address doesn't just appear on its own.

given a 2-dimensional array at char **a, how do I get the item in position 5,2?

C: r = a[5*COLS + 2]

ASM:
mov di, 5
mul di, COLS
add di, 2
add di, a
mov r, [di]

They're equivelant (essentially) in every way and take equal processing time.

- taubz

Posted by Psion [send private reply] at May 09, 2001, 11:49:56 AM

t, using an array of pointers to arrays is different than using a 2D array. No multiplication is needed.

Posted by taubz [send private reply] at May 09, 2001, 04:47:59 PM

Oh is that was paramod was referring to? Doh... Nevermind.

- taubz

Posted by faboo [send private reply] at June 06, 2001, 12:47:24 AM

IANAAP (I Am Not An Assembly Programmer) (Not that I know how this deteriated into a Intel assembly discussion, but wtf, right?) but,

When using a macro assembler, a number of quick symbols (like []) are often used in lieu of having the coder memorize multiple versions of a single instruction.

There are _five_ different variations on the MOV instruction:
[from Intel manual]
1.To a register from memory
2.To memory from a register
3.Between general registers
4.Immediate data to a register
5.Immediate data to a memory

Three of these variations involve the implicit realiztion of one of the operands as an address (1,2,5). That is, they _dereference_ the operand. Each of these five variations of the MOV has its own binary representation, and the compiler decides which one to use by context (it seems to me that one should be able to do this explicitly, and I even think I saw it somewhere once, but on a quick perusal the Intel x86 manual doesn't seem to say anything about it).
Anyway, in response to what pramod said previously: The processor (x86 and its derivitives; though with all its shortfalls, who knows about the P4 :) actually has facillities for dereferencing "pointers". I don't believe you can dereference an address within many other instructions, only because it would be quite slow, and the processor would have to MOV the value into a register anyway.

Not to attack you personally, btw :)

faboo

Posted by pramod [send private reply] at June 07, 2001, 08:26:37 PM

I don't know much about things like the P4, but dereferencing a pointer is just using an index register and doing all the things that you normally do with it. You see, I agree that using an index register is efeectively dereferecing a pointer but there isn't much overhead [I think its about 2/3 processor cycles] the way Psion claimed that dereferencing a pointer slows up the method. That is all I wanted to say.

btw, the borland c++ compiler allows you to write forms like :

mathptr = new [r][COLS] where COLS is a compile time constant while r is NOT. [I not been able to get this work, but the documentation metions it. Is this a standard C++ feature ?]

Posted by Psion [send private reply] at June 08, 2001, 07:45:15 AM

I'm not sure of what you meant there, but modern C has always allowed you to leave the most-significant dimension of an array unspecified at compile time, since it is not needed in address calculations.

Posted by hosseingh1 [send private reply] at June 10, 2001, 02:07:16 PM

it is better for you to define a class for your array,and use operator [] for your class .so you create your dynamic memory in your class constructor and then you can use and delete it without any problem.

Posted by Psion [send private reply] at June 10, 2001, 02:26:23 PM

It's not "better". That way will be slower. It all depends on your tastes.

Posted by gian [send private reply] at June 11, 2001, 03:54:12 AM

I believe this is becoming a trivial "I see your knowledge of asm is a good as mine" type thread...

Posted by nt543 [send private reply] at June 17, 2001, 09:20:05 PM

This seems to work for me: (I'm using Symantec C++ 7.5)

char (far *array)[NUM_ROWS];
array = _fmalloc(NUM_ROWS*NUM_COLUMNS);
array[row][column] = value;

Posted by Psion [send private reply] at June 18, 2001, 07:48:41 AM

Yeah, that's the way I was referencing for if you know one dimension at compile time. pramod says that's slower than the array of pointers to arrays method.

Posted by pramod [send private reply] at June 19, 2001, 02:12:27 AM

I actually did some tests on my compiler. It works out that if you don't have cols constant, its about a 100% slower. With cols constant, and not necessarily a power of two, it seems that both methods are almost the same. [I didn't do too many tests in the second case though]

You must be logged in to post messages and see which you have already read.