Difference between revisions of "Performance Test"

Revision as of 08:39, 28 February 2014

Introduction

This page summarises results of the YADE performance test (yade --performance) on multicore machines. It should give an idea of how well YADE scales.

Test 1

Two versions of YADE are compared on two different machines. The tests were run by Klaus on the computing grid of the University of Newcastle.

YADE versions:

version1 (trunk): 2014-01-25.git-22c2441
version2 (see [1]): 2014-02-24.git-b60d388

Machines:

AMD: AMD Opteron(tm) Processor 6282 SE (64 cores)
Intel: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz (16 cores)

[1] : https://lists.launchpad.net/yade-dev/msg10498.html

Performance of Parallel Collider

Fig. 1: Scaling on Intel machine

Fig. 1 shows how version2 (parallel collider) scales relative to version1 on the Intel machine. For simulations with fewer than 100000 particles, the scaling is almost independent of the number of threads and is only slightly greater than one. For simulations with more than 100000 particles the picture changes: using all cores of the machine is not recommended, since -j12 (and probably -j14) scales better than -j16.

Fig. 2: Scaling on AMD machine

Fig. 2 shows how version2 (parallel collider) scales relative to version1 on the AMD machine. A similar trend to Fig. 1 can be observed. However, the Intel machine appears to scale better with fewer than 12 threads (below -j12).
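The following is a minimal sketch of how such a relative scaling could be computed. It assumes that "scaling" means the ratio of the version2 iteration rate to the version1 iteration rate at the same thread count. The function names and all numbers are hypothetical and are not the measured results behind Fig. 1 or Fig. 2.

```python
def relative_scaling(baseline, candidate):
    """Return the candidate/baseline speed ratio for each shared thread count."""
    return {j: candidate[j] / baseline[j] for j in sorted(baseline) if j in candidate}


def best_thread_count(scaling):
    """Return the thread count with the largest ratio (ties go to fewer threads)."""
    return min(scaling, key=lambda j: (-scaling[j], j))


# Hypothetical iteration rates (iter/sec) keyed by thread count; NOT measured data.
version1 = {4: 10.0, 8: 12.0, 12: 12.5, 16: 12.0}
version2 = {4: 20.0, 8: 48.0, 12: 75.0, 16: 60.0}

scaling = relative_scaling(version1, version2)
for threads, ratio in scaling.items():
    print(f"-j{threads}: version2/version1 = {ratio:.2f}")
print("best thread count:", best_thread_count(scaling))
```

With real timings in place of the placeholder dictionaries, the same two functions would show whether the full core count is actually the best choice on a given machine.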

Comparison AMD/Intel

Fig. 3: Comparison of Intel vs. AMD for version1

Fig. 3 shows the difference between running version1 on an Intel or AMD machine. The AMD is generally slower (Intel/AMD>1).

Fig. 4: Comparison of Intel vs. AMD for version2

Fig. 4 shows the difference between running version2 on an Intel or AMD machine. Again, the AMD is generally slower (Intel/AMD>1).
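As a rough illustration of the Intel/AMD ratio used in Fig. 3 and Fig. 4, the sketch below divides the Intel iteration rate by the AMD iteration rate for each particle count. It assumes the ratio is formed from speeds (so values above one mean the AMD machine is slower). The function name and the numbers are hypothetical placeholders, not data from the test.

```python
def platform_ratio(intel, amd):
    """Intel/AMD speed ratio per particle count; values > 1 mean Intel is faster."""
    return {n: intel[n] / amd[n] for n in sorted(intel) if n in amd}


# Hypothetical iteration rates (iter/sec) keyed by particle count; NOT measured data.
intel_rate = {50_000: 30.0, 100_000: 16.0, 500_000: 3.0}
amd_rate = {50_000: 20.0, 100_000: 10.0, 500_000: 2.0}

ratios = platform_ratio(intel_rate, amd_rate)
amd_slower = [n for n, r in ratios.items() if r > 1.0]
print(ratios)
print("particle counts where AMD is slower:", amd_slower)
```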

Conclusions

The new parallel collider scales well for the --performance test with more than 100000 particles. The scaling for 500000 particles is really good, i.e. -j12 scales by a factor of 6 on both machines. Intel machines perform better (similar observations have been made here: https://yade-dem.org/wiki/Colliders_performace). Finally, I would say that there is an optimum number of threads you should use per simulation. More cores do not always mean a much faster run, so use your resources wisely.
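One way to look for that optimum is to repeat the performance test for several thread counts. The sketch below only builds and prints the command lines; it does not run them. It uses the --performance and -j options mentioned on this page. The executable name "yade", the option order, the chosen thread counts, and the helper name are assumptions and may need adapting to the local installation.

```python
import shlex


def performance_commands(thread_counts, executable="yade"):
    """Build one command line per thread count (executable name is an assumption)."""
    return [[executable, f"-j{j}", "--performance"] for j in thread_counts]


commands = performance_commands([4, 8, 12, 16])
for cmd in commands:
    # Print only; pass each list to subprocess.run(cmd) to actually execute it.
    print(shlex.join(cmd))
```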

Revision as of 08:39, 28 February 2014 by Thoeni (section edited: Conclusions). Previous revision: 08:24, 28 February 2014 by Thoeni (section edited: Comparison AMD/Intel).