Andrew Stubbs

Chung-Lin Tang

Last week

  • PR48250, overhaul of arm_legitimize_reload_address(). Richard Sandiford caught a bug of mine where I overlooked the valid index range of NEON quad-word load/stores. Quickly wrote a fix, which was soon approved and committed upstream.
  • LP #744754, ICE in NEON struct-mode auto-inc-dec MEMs. Pushed upstream patch for a merge to Linaro 4.5.
  • PR46888, bit-field insert optimization patch. Resumed investigating, mailed Andrew Pinski for more information on that REG_EQUAL note issue he mentioned on gcc-patches; can't quite reproduce it myself.
  • CoreMark ARMv6/v7 regressions: posted a patch set to gcc-patches. Still awaiting review.

  • Reported an issue (LP #748138) to Bernd and AndrewS which seems to be related to the shrink-wrap patch. However, the ICE does not seem to be avoided by -fno-shrink-wrap.
  • A few tasks related to Linaro-Budapest event travel.

This week

  • Do the merge of the new combine patches to Linaro, and test.
  • LP #689887 is still in progress.
  • Hope to experiment with a few more optimization ideas.

Dave Gilbert

String and Memory routines

  • Profiled denbench with perf and produced a set of stats showing how much time each program spent in libc and how much time was spent in each routine. While some of the benchmarks are good (like aes) and spend almost no time in libc, some of the others (MPEG codecs especially) seem to spend a significant amount of time in libc.
  • Ran all of denbench through latrace to generate sets of library calls; post-processed them to extract the section between the clock() calls (and hence in the timed portion) and analysed the hot library calls. I've looked at some of the output but not all of it yet; I get output like:
  • Memcpy stats (dst align/src align/length/number of occurrences/total size copied)
    • memcpy: 0,0,1 , 1588520, 1588520
    • memcpy: 16,28,4096 , 1, 4096
    • memcpy: 4,20,16384 , 855, 14008320
    • This shows that a bunch of tests do an inordinate number of 1-byte memcpy's, plus a few hundred larger memcpy's whose addresses are 4 and 20 modulo 32: not aligned, but at least equally misaligned modulo 16.
  • Started writing up a report on some of the stats
  • Also started to try and extract the same stuff from SPEC2k6
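
The memcpy stats above lend themselves to a small post-processing script. The following is only a sketch, assuming each line has exactly the shape shown in the sample output (dst align, src align, length, number of occurrences, total size copied); the names `MemcpyStat`, `parse_stats`, `is_consistent` and `same_misalignment` are hypothetical and are not part of latrace or of the scripts actually used.

```python
import re
from typing import NamedTuple


class MemcpyStat(NamedTuple):
    # Field order follows the header in the report:
    # dst align / src align / length / number of occurrences / total size copied
    dst_align: int
    src_align: int
    length: int
    count: int
    total_bytes: int


# Assumed line shape, taken from the sample output: "memcpy: 4,20,16384 , 855, 14008320"
LINE_RE = re.compile(
    r"^\s*memcpy:\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*$"
)


def parse_stats(lines):
    """Parse the matching lines into MemcpyStat records, skipping anything else."""
    stats = []
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            stats.append(MemcpyStat(*map(int, match.groups())))
    return stats


def is_consistent(stat):
    """Sanity check: length times occurrences should equal the total size copied."""
    return stat.length * stat.count == stat.total_bytes


def same_misalignment(stat, modulus):
    """True if source and destination share the same offset modulo `modulus`."""
    return stat.dst_align % modulus == stat.src_align % modulus


SAMPLE = [
    "memcpy: 0,0,1 , 1588520, 1588520",
    "memcpy: 16,28,4096 , 1, 4096",
    "memcpy: 4,20,16384 , 855, 14008320",
]

if __name__ == "__main__":
    for stat in sorted(parse_stats(SAMPLE), key=lambda s: s.count, reverse=True):
        print(stat, "consistent" if is_consistent(stat) else "INCONSISTENT")
```

Sorting by occurrence count puts the 1-byte copies first, while `same_misalignment(stat, 16)` picks out the large copies whose two buffers are offset identically within a 16-byte block.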

QEMU

  • Tested Peter's QEMU release earlier in the week (on Lucid, so didn't hit his natty bug)
  • Wrote up a couple of specs (one for TrustZone and the other for Device Tree integration)

Ira Rosen

Ken Werner

libunwind

  • added initial support for resuming at a certain stack frame
  • posted unw_resume support plus some testsuite fixes on the mailing list
  • there are still some issues left if signal handlers/frames are involved

Marcin Juszkiewicz

Michael Hope

Mounir Bsaibes

Peter Maydell

maintain-beagle-models

  • qemu-linaro 2011.04 tested and released
  • had to do another last-minute -1 respin to fix a problem caught by Ubuntu package builds; we need to come up with a process that lets us do test package builds prior to release so we can fix this sort of issue in a less last-minute fashion

merge-correctness-fixes

  • sent patch: fix semihosting SYS_HEAPINFO (seems to have issues though)
  • sent patch: UNDEFs in Neon load/store space
  • sent patches: fix build issues on sparc
  • sent patch: bump the initrd load address to work with bigger kernels
  • sent patch: set Invalid flag for float-to-int conversion of NaN
  • sent patch: move vld/vst multiple to helper functions
  • reviewed patches from Aurelien doing some general softfloat cleanup
  • sent out a version of my performance counters patch which just does a basic dummy implementation without the cycle counter (the cycle counter bits were heading down a blind alley, and this part is the last thing needed to be able to boot Linaro vexpress images on stock upstream QEMU)

other

Ramana Radhakrishnan

Revital Eres

Richard Sandiford

This week

  • Iterated with upstream on some of the vectorisation patches. I think only half a patch (the ARM implementation of array_mode_supported_p) is still pending review; everything else has been approved.
  • Backported the vldN and vstN intrinsics to Linaro 4.5.
  • Finished off the microbenchmarks for libav.
  • One of the problems in the original libav output was that the vectoriser didn't realise that a group of N accesses really did form a group. It instead generated N separate interleave operations and took 1 vector from each.
  • Submitted a fix for that, which is now committed upstream. Updated the libav wiki page with the new, improved output. (This actually allowed more libav loops to be vectorised, as well as improving the output from some of the existing ones. I haven't looked at the new ones. I expect this comes from interleaved stores.)
  • Wrote an arm_neon.h version of Android's scanline_t32cb16_neon and compared it with the original.
  • Started (and only started) seeing how the vectorisation stuff affects DENbench.
  • Backported the dwarf2out OOM fix to Linaro 4.6.
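
The grouped-access problem mentioned above (N separate interleave operations, with one vector taken from each) can be pictured with a toy model. This is a plain-Python sketch of the idea only, not of the vectoriser's actual code; `load_group` and `load_separately` are hypothetical names, and lists stand in for vector registers.

```python
def load_group(memory, base, n, lanes):
    """Model one grouped load: read n * lanes consecutive elements and
    de-interleave them into n vectors, where vector k holds elements
    k, k + n, k + 2n, ... of the chunk."""
    chunk = memory[base:base + n * lanes]
    return [chunk[k::n] for k in range(n)]


def load_separately(memory, base, n, lanes):
    """Model the redundant form: run a full de-interleave once per access
    and keep only one of the n resulting vectors each time.
    Returns the vectors and the number of de-interleave operations used."""
    vectors = []
    operations = 0
    for k in range(n):
        vectors.append(load_group(memory, base, n, lanes)[k])
        operations += 1
    return vectors, operations


if __name__ == "__main__":
    memory = list(range(12))  # e.g. 4 structures of 3 fields each
    print(load_group(memory, 0, 3, 4))       # one de-interleave
    print(load_separately(memory, 0, 3, 4))  # same vectors, 3 de-interleaves
```

Both functions return the same vectors; the second simply does N times the de-interleaving work, which is the kind of redundancy that recognising the accesses as one group removes.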

Next week

  • Do something useful on DENbench.
  • If all goes well, commit the vectorisation work upstream and backport it to Linaro 4.5 and 4.6.

Ulrich Weigand

GDB

  • Created Linaro GDB 7.2-2011.04-0 release.
  • Committed patch to fix accessing "fpscr" register to Linaro GDB.
  • Failure to disable address space randomization (bug #616001) has been fixed in the kernel; closed the Linaro GDB bug.

GCC

  • Ongoing analysis of bug #759409 (Profiled bootstrap fails). Identified two independent problems, one related to a new CodeSourcery feature, and one a mis-optimization of final-stage cc1plus which is also present in FSF GCC 4.5 (PR 43085). Ongoing investigation to track down the root cause of the latter problem.

Schedule

  • Public holidays 04/22 - 04/25.

WorkingGroups/ToolChain/ActivityReports/2011-04-22 (last modified 2011-04-26 18:41:08)