<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
  <channel>
    <title>using the preprocessor to do something N times - C++ - tribe.net</title>
    <link>http://cpp.tribe.net/thread/faec10ae-2697-4698-af8a-0b1fec29eb71?format=rss</link>
    <description>Tribe.net. Local Connections</description>
    <item>
      <title>Re: using the preprocessor to do something N times</title>
      <link>http://CPP.tribe.net/thread/faec10ae-2697-4698-af8a-0b1fec29eb71#dc2d1cb8-3b55-4ed0-b300-d4e3a9a4a5bd</link>
      <description>Jason,&#xD;
&#xD;
Yes, you can do this with templates.  It's called template metaprogramming, and it has its own slew of weirdnesses.  Like any tool, it can be used to do cool things (good), yet also to write code that is absolute hell to figure out (evil).  Basically, with templates, you can "run" code at compile time, without ever running the compiled program.&#xD;
&#xD;
Here is a URL that describes some of what is possible with template metaprogramming:&#xD;
&#xD;
http://osl.iu.edu/~tveldhui/papers/Template-Metaprograms/meta-art.html&#xD;
&#xD;
Regards,&#xD;
&#xD;
John&#xD;
&#xD;
Falling You - exploring the beauty of voice and sound&#xD;
http://www.fallingyou.com</description>
      <pubDate>Sun, 30 Jul 2006 21:36:29 GMT</pubDate>
      <guid isPermaLink="false">http://CPP.tribe.net/thread/faec10ae-2697-4698-af8a-0b1fec29eb71#dc2d1cb8-3b55-4ed0-b300-d4e3a9a4a5bd</guid>
      <dc:creator>John Michael</dc:creator>
      <dc:date>2006-07-30T21:36:29Z</dc:date>
    </item>
    <item>
      <title>Re: using the preprocessor to do something N times</title>
      <link>http://CPP.tribe.net/thread/faec10ae-2697-4698-af8a-0b1fec29eb71#7dfd4756-7e9b-4df4-99e1-8039c988d690</link>
      <description>btw I was thinking of using a macro like:&#xD;
&#xD;
#define BODY x += p[a++]&#xD;
&#xD;
then in the loop body doing:&#xD;
 BODY;&#xD;
 BODY;&#xD;
 BODY;&#xD;
etc...&#xD;
&#xD;
Turns out that this kills the pipelining, since the processor must increment a before it can issue the next read to the memory controller.  The best speed I get comes from hardcoding the offsets, as in the code in my previous post.</description>
      <pubDate>Sun, 30 Jul 2006 20:35:58 GMT</pubDate>
      <guid isPermaLink="false">http://CPP.tribe.net/thread/faec10ae-2697-4698-af8a-0b1fec29eb71#7dfd4756-7e9b-4df4-99e1-8039c988d690</guid>
      <dc:creator>Jason</dc:creator>
      <dc:date>2006-07-30T20:35:58Z</dc:date>
    </item>
    <item>
      <title>Re: using the preprocessor to do something N times</title>
      <link>http://CPP.tribe.net/thread/faec10ae-2697-4698-af8a-0b1fec29eb71#3632b278-de32-4b65-aeb6-0070196fe1b9</link>
      <description>You might be surprised.  Unrolling the follwing naive loop:&#xD;
&#xD;
		x = 0;&#xD;
		for(int a = 0; a &amp;lt; READ_UNROLL_BLOCKSIZE; a++)&#xD;
		{&#xD;
			x += p[a];&#xD;
		}&#xD;
&#xD;
To the following:&#xD;
		int a;&#xD;
		x = 0;&#xD;
&#xD;
		// we will use 16-fold unroll&#xD;
		int blockLimit = READ_UNROLL_BLOCKSIZE &amp;amp; ~15;&#xD;
		for(a = 0; a &amp;lt; blockLimit; a+=16)&#xD;
		{&#xD;
			x += p[a + 0];&#xD;
			x += p[a + 1];&#xD;
			x += p[a + 2];&#xD;
			x += p[a + 3];&#xD;
			x += p[a + 4];&#xD;
			x += p[a + 5];&#xD;
			x += p[a + 6];&#xD;
			x += p[a + 7];&#xD;
			x += p[a + 8];&#xD;
			x += p[a + 9];&#xD;
			x += p[a + 10];&#xD;
			x += p[a + 11];&#xD;
			x += p[a + 12];&#xD;
			x += p[a + 13];&#xD;
			x += p[a + 14];&#xD;
			x += p[a + 15];&#xD;
		}&#xD;
		&#xD;
		// process the remaining reads&#xD;
		for (a = blockLimit; a &amp;lt; READ_UNROLL_BLOCKSIZE;)&#xD;
		{&#xD;
			x += p[a++];&#xD;
		}&#xD;
results in a significant performance increase.  If we run the routine several million times to get a good sample, we see numbers like this:&#xD;
&#xD;
  Running : Loop Unroll Read&#xD;
  Running unoptimized version&#xD;
  Elapsed time [6.33] seconds&#xD;
  Running optimized version&#xD;
  Elapsed time [3.17] seconds&#xD;
&#xD;
That is roughly a 2x speedup.  In the unrolled loop we do 16 reads before we hit the loop condition, which lets the processor pipeline far more of the work.  Unrolling more than 16-fold, however, appears to yield negligible further improvement.</description>
      <pubDate>Sun, 30 Jul 2006 20:26:09 GMT</pubDate>
      <guid isPermaLink="false">http://CPP.tribe.net/thread/faec10ae-2697-4698-af8a-0b1fec29eb71#3632b278-de32-4b65-aeb6-0070196fe1b9</guid>
      <dc:creator>Jason</dc:creator>
      <dc:date>2006-07-30T20:26:09Z</dc:date>
    </item>
    <item>
      <title>Re: using the preprocessor to do something N times</title>
      <link>http://CPP.tribe.net/thread/faec10ae-2697-4698-af8a-0b1fec29eb71#75ad724f-13eb-44b9-bc4c-8ad0cd32b70d</link>
      <description>Not that I know of.  You could make an inline function or macro to simplify the code but you'll still have to call the function or macro 16 times.&#xD;
&#xD;
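For what it's worth, a minimal sketch of the macro route might look like the following.  The ADD_n macros and the sum16 function are made-up names for illustration, and the macros assume that variables called x, p and a are in scope wherever they are expanded.

```cpp
// Hypothetical helper macros (not part of any library). Each level expands to
// two copies of the level below it, so ADD_16(0) becomes 16 statements whose
// offsets are compile-time constants: p[a + 0] through p[a + 15].
#define ADD_1(i)  x += p[a + (i)];
#define ADD_2(i)  ADD_1(i) ADD_1((i) + 1)
#define ADD_4(i)  ADD_2(i) ADD_2((i) + 2)
#define ADD_8(i)  ADD_4(i) ADD_4((i) + 4)
#define ADD_16(i) ADD_8(i) ADD_8((i) + 8)

// Sums the 16 ints p[a] .. p[a + 15]; the caller must guarantee they exist.
int sum16(const int* p, int a)
{
    int x = 0;
    ADD_16(0)
    return x;
}
```

You still write five macro definitions instead of sixteen statements, so it only shortens the code.  Because every offset is a constant, no statement has to wait for an increment of a; whether that helps in practice depends on the compiler and CPU, so measure it.
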
I wonder how much of a performance increase loop unrolling of this type would yield since (I believe) nearly all modern CPU architectures have good branch prediction/multiple execution pipelines for simple loops like this.</description>
      <pubDate>Sun, 30 Jul 2006 19:51:51 GMT</pubDate>
      <guid isPermaLink="false">http://CPP.tribe.net/thread/faec10ae-2697-4698-af8a-0b1fec29eb71#75ad724f-13eb-44b9-bc4c-8ad0cd32b70d</guid>
      <dc:creator>Jon</dc:creator>
      <dc:date>2006-07-30T19:51:51Z</dc:date>
    </item>
    <item>
      <title>using the preprocessor to do something N times</title>
      <link>http://CPP.tribe.net/thread/faec10ae-2697-4698-af8a-0b1fec29eb71#5f60f79a-146a-478a-b63e-0932fb12ebed</link>
      <description>Is there a c++ preprocessor instruction to repeat something a number of times?  &#xD;
&#xD;
I am doing some loop unrolling and I would like to do something like this (making up my own preprocessor command :P)&#xD;
&#xD;
#repeat 16&#xD;
    x += p[a++];&#xD;
#endrepeat</description>
      <pubDate>Sun, 30 Jul 2006 19:27:30 GMT</pubDate>
      <guid isPermaLink="false">http://CPP.tribe.net/thread/faec10ae-2697-4698-af8a-0b1fec29eb71#5f60f79a-146a-478a-b63e-0932fb12ebed</guid>
      <dc:creator>Jason</dc:creator>
      <dc:date>2006-07-30T19:27:30Z</dc:date>
    </item>
  </channel>
</rss>



