Benchmarking a Nexus 7 (2013 version) running AOSP master built with gcc 4.9 vs. clang 3.5

These benchmarks were run in late October 2014, shortly after clang-built AOSP first worked. Some of the poor clang numbers are caused by components that do not work yet and run into timeouts. There is room for improvement in the clang results: compiler flags and similar settings have been tuned for gcc, but not yet for clang. All tests were run 3 times; the numbers listed here are the averages of the 3 runs. Builds were done with the default versions of gcc and clang present in AOSP (master branch as of October 2014), so that the results are easy to reproduce upstream.

Compile time

make droidcore -j12

        gcc            clang
real    112m50.341s    99m30.494s
user    700m20.836s    629m55.376s
sys     82m5.260s      67m46.403s
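
To put the build times above in relative terms, the following sketch converts the `time`-style durations to seconds and reports how much shorter the clang build was. The helper names `to_seconds` and `percent_saved` are made up for this illustration and are not part of the AOSP build system.

```shell
# Convert a `time`-style duration such as 112m50.341s into seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Percentage by which the second duration is shorter than the first.
percent_saved() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", (a - b) / a * 100 }'
}

percent_saved 112m50.341s 99m30.494s    # real: gcc vs. clang
percent_saved 700m20.836s 629m55.376s   # user: gcc vs. clang
percent_saved 82m5.260s 67m46.403s      # sys:  gcc vs. clang
```

For the numbers listed here this works out to roughly 12% less wall-clock time, 10% less user time and 17% less system time for the clang build.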

Binary size

ls -lR |grep -v ':$' |grep -v '^total.*' |grep -v '^[dl]' |grep -v '^$' |awk '{ print $5; }' |while read r; do S=$((S+r)); echo $S; done

                      gcc          clang
Total size (bytes)    298039677    306041268
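
The pipeline above prints a running total for every file, so the figure of interest is its last output line. As a sketch, the same filtering and summing can be done in a single awk call that prints only the final total; `total_size` and `percent_increase` are hypothetical helper names, and the directory passed to `total_size` is whatever build output tree is being measured (the page does not say which one was used).

```shell
# Sum field 5 of `ls -lR` (size in bytes), skipping directory headers
# ("path:"), "total" lines, directory and symlink entries, and blank lines.
total_size() {
    ls -lR "$1" | awk '!/:$/ && !/^total/ && !/^[dl]/ && NF { sum += $5 }
                       END { printf "%.0f\n", sum }'
}

# Percentage by which the second size exceeds the first.
percent_increase() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", (b - a) / a * 100 }'
}

# Usage (the directory name is an assumption):
#   total_size path/to/build/output

percent_increase 298039677 306041268    # gcc vs. clang totals from above
```

With the totals listed here, the clang build comes out about 2.7% larger (8001591 bytes).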

Benchmark results (unless indicated otherwise, higher numbers are better)

Benchmark                             gcc               clang
AndEBench Native                      11566             11626
AndEBench Java                        810               762
BenchmarkPi (in ms, lower is better)  219               214
CaffeineMark                          27348             27182
- Sieve                               27348             27182
- Loop                                35565             37746
- Logic                               53771             53818
- String                              28942             30386
- Float                               14979             16194
- Method                              20456             16483
CF-Bench                              17620             13354
- Native MIPS                         2263              1670
- Java MIPS                           1227              901
- Native MSFLOPS                      868               632
- Java MSFLOPS                        869               636
- Native MDFLOPS                      571               414
- Java MDFLOPS                        570               422
- Native MALLOCS                      16193             15420
- Native Memory Read                  10016             7375
- Java Memory Read                    2371              1864
- Native Memory Write                 4106              4645
- Java Memory Write                   2424              1936
- Native Disk Read                    511               494
- Native Disk Write                   289               293
- Java Efficiency MIPS                54%               53%
- Java Efficiency MSFLOPS             100%              100%
- Java Efficiency MDFLOPS             99%               101%
- Java Efficiency Memory Read         23%               25%
- Java Efficiency Memory Write        59%               41%
- Native Score                        25065             19088
- Java Score                          12658             9532
Geekbench 2                           2606              2150
- Integer                             1780              1458
- Floating Point                      4181              3326
- Memory                              2301              2090
- Stream                              599               584
Geekbench 3 (ST/MT)                   575/1854          577/1354
- Integer                             610/2136          614/1541
- Floating Point                      517/1978          519/1322
- Memory                              625/1043          619/1045
Quadrant Pro                          7585              6192
- CPU                                 26620             18977
- Memory                              7133              7921
- I/O                                 1727              1701
- 2D                                  247               249
- 3D                                  2196              2111
Smartbench 2012                       3507              3373
SQLite bench (QPS/Time)               5628.717/15.671   6271.138/18.581
- Insert 200                          126.984/1.575     100/2
- Insert 15000 TA                     10211.028/1.469   6880.734/2.18
- Update 500                          106.247/4.706     81.433/6.14
- Update 15000 TA                     8241.758/1.82     17301.038/0.867
- Select 15000                        5707.763/2.628    4486.988/3.343
- Delete 200                          107.817/1.855     86.022/2.325
- Delete 15000 TA                     9270.705/1.618    8690.614/1.726
Vellamo                               607               crashes
Linpack Pro ST (MFLOPS/s)             100.159/0.84      98.86/0.85
Linpack Pro MT (MFLOPS/s)             291.307/0.58      199.134/0.85

Notes

  • In both Geekbench 3 and Linpack Pro, clang does well in the single-threaded versions and badly in the multi-threaded versions. This should be re-verified with clang 3.6 snapshots, since a lot of progress on OpenMP has been made there.
  • The build system should be tweaked to use more clang-specific features (e.g. -mcpu=krait on a Nexus 7-2013; AOSP's assumption that a Krait is a Cortex-A15 is suboptimal, and in some ways a Krait is closer to a Cortex-A9). -Oz may be interesting for binary size.
  • Currently, the clang Nexus 10 build is more stable than the clang Nexus 7 build. Performance should be compared on Nexus 10 as well.
  • ARMv8 builds and performance comparisons are still to be done.
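
The first note, on single-threaded versus multi-threaded results, can be made concrete by dividing each multi-threaded score by the matching single-threaded score. The sketch below does this for the overall Geekbench 3 scores and the Linpack Pro MFLOPS figures from the table; `mt_scaling` is a hypothetical helper name, and the ratio is only a rough indicator, not something the benchmarks report themselves.

```shell
# Ratio of a multi-threaded score to the matching single-threaded score.
mt_scaling() {
    awk -v st="$1" -v mt="$2" 'BEGIN { printf "%.2f\n", mt / st }'
}

mt_scaling 575 1854           # Geekbench 3, gcc
mt_scaling 577 1354           # Geekbench 3, clang
mt_scaling 100.159 291.307    # Linpack Pro MFLOPS, gcc
mt_scaling 98.86 199.134      # Linpack Pro MFLOPS, clang
```

With these numbers, the gcc build scales by about 3.2x (Geekbench 3) and 2.9x (Linpack Pro) from one thread to many, while the clang build reaches only about 2.3x and 2.0x, even though the single-threaded scores are nearly identical.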

Platform/Android/GccClangBenchmark (last modified 2014-12-29 09:45:45)