Tuesday 13 Dec 2011

Attendees

Agenda

  • Review action items from last meeting
  • Neon alignment hints and addressing modes.
  • movw / movt vs constant pools.
  • 64-bit neon
  • AOB

Action Items from this Meeting

  • Ramana to create a blueprint about movw / movt vs constant pool entries.

Action Items from Previous Meeting

  • TBA

Minutes

  • Alignment hints - vld1.64 is being generated instead of vldm instructions. Peeling causes a performance loss compared to 124 words. The compiler is generating incorrect hints after the patch - we are not sure why this is happening. Addressing mode tweaks might be interesting. MEM_ALIGN and its correctness in the compiler - other parts of the compiler rely on this being correct, so it is something we should investigate further.

References -

http://gcc.gnu.org/ml/gcc-patches/2010-12/msg01810.html

Interaction between TYPE_ALIGN in the vectorizer - if you know stuff is aligned to 128

There is a comment in emit-rtl.c: set_mem_attributes_minus_bitpos

Discussions with Richi about MEM_ALIGN and SPU - there was reliance on 128-bit alignments.
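
A minimal sketch for experimenting with how known alignment reaches the vectorizer, assuming a GCC new enough to provide `__builtin_assume_aligned` (4.7 or later). The function name `sum_aligned128` is made up for illustration. Whether the ARM back end then emits an alignment hint on vld1.64, or skips peeling, depends on the compiler version and flags, so the generated assembly needs to be checked by hand.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum n 32-bit values.  The caller promises that p is 16-byte (128-bit)
   aligned; __builtin_assume_aligned passes that promise on to the compiler.
   Calling this with a pointer that is not 16-byte aligned is undefined. */
uint32_t sum_aligned128(const uint32_t *p, size_t n)
{
    const uint32_t *ap = __builtin_assume_aligned(p, 16);
    uint32_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += ap[i];

    return total;
}
```

Compiling this for ARM with something like `-O3 -mfpu=neon -S`, with and without the `__builtin_assume_aligned` line, is one way to compare the loads and any peeling prologue the vectorizer produces.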

  • 64-bit shifts and QImode values - rearnsha pointed out what could be done to make things better; hopefully we can get it doing the right thing. SImode instead of DImode for the shift amount - doesn't do it in NEON anymore. What happens with crafty?

optabs and experimentation.
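
As a rough illustration of why the shift amount does not need to be a 64-bit value, the sketch below does a 64-bit logical left shift on two 32-bit halves with a plain 32-bit count. This is not what GCC generates and not the approach discussed in the meeting; the function name `shl64_via_halves` is hypothetical.

```c
#include <stdint.h>

/* Logical left shift of a 64-bit value held as two 32-bit halves.
   The shift amount is an ordinary 32-bit unsigned; only 0..63 is meaningful,
   so it is masked to that range here. */
uint64_t shl64_via_halves(uint32_t lo, uint32_t hi, unsigned amount)
{
    uint32_t out_lo, out_hi;

    amount &= 63;
    if (amount == 0) {
        /* Handled separately: a 32-bit shift by 32 would be undefined in C. */
        out_lo = lo;
        out_hi = hi;
    } else if (amount < 32) {
        /* Bits leaving the low half enter the bottom of the high half. */
        out_hi = (hi << amount) | (lo >> (32 - amount));
        out_lo = lo << amount;
    } else {
        /* The whole low half moves into the high half. */
        out_hi = lo << (amount - 32);
        out_lo = 0;
    }

    return ((uint64_t)out_hi << 32) | out_lo;
}
```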

  • Odd timing behaviours on Panda board with relatively recent Linaro kernels - Dave's been working with TI-LT.
    • 2 different anomalies. A test loop runs a benchmark 10 times: the 1st run takes significantly longer even though it's not cache bound, and every so often one of the iterations will be 3x faster. With the latest kernel from the TI folks, the fast behaviour dies. If you find yourself running stuff on a new installation, watch out for odd behaviour.
  • Dave Gilbert got a rather nice NEON implementation of memchr - did it using a 32-byte memchr loop with 256-bit alignment. Don't think it's a gain.
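
For reference, the loop structure described for memchr (align first, then consume 32 bytes per iteration) can be sketched in portable C. This is a scalar stand-in and not Dave Gilbert's NEON code: the name `memchr_block32` and the `BLOCK` constant are assumptions made for illustration, and a NEON version would replace the inner byte loops with vector loads and compares.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK 32  /* bytes per main-loop iteration (256 bits) */

void *memchr_block32(const void *s, int c, size_t n)
{
    const unsigned char *p = s;
    unsigned char ch = (unsigned char)c;

    /* Head: go byte by byte until p is 32-byte aligned. */
    while (n > 0 && ((uintptr_t)p % BLOCK) != 0) {
        if (*p == ch)
            return (void *)p;
        p++;
        n--;
    }

    /* Main loop: one aligned 32-byte block per iteration. */
    while (n >= BLOCK) {
        unsigned found = 0;

        /* First ask "is the byte anywhere in this block?" ... */
        for (size_t i = 0; i < BLOCK; i++)
            found |= (p[i] == ch);

        /* ... and only then locate the first match. */
        if (found) {
            for (size_t i = 0; i < BLOCK; i++)
                if (p[i] == ch)
                    return (void *)(p + i);
        }
        p += BLOCK;
        n -= BLOCK;
    }

    /* Tail: fewer than 32 bytes remain. */
    while (n > 0) {
        if (*p == ch)
            return (void *)p;
        p++;
        n--;
    }

    return NULL;
}
```

Timing this against the C library's memchr on the target board would be one way to check whether the 32-byte loop is actually a gain.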

WorkingGroups/ToolChain/Meetings/Archive/2011-12-13 (last modified 2013-08-30 11:50:06)