High-volume sampling: a comparison of algorithms
Goal: To determine the most efficient way to perform high-volume sampling.
Set-up: The sampler's job is to grant permission to sample at a given sampling rate: approximately every 1000th request should result in actual sampling. There are several samplers:
- AtomicSampler: uses an AtomicInteger to keep track of sampling attempts.
- RandomSampler: uses an instance-wide Random instance to determine if a sample should be taken.
- RandomSampler2: uses a local Random instance, created anew on each call, to determine if a sample should be taken.
- XorShiftSampler: uses a XOR-shift algorithm to produce pseudo-random numbers (http://www.javamex.com/tutorials/random_numbers/xorshift.shtml).
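The four samplers above could look roughly like the following. This is a minimal sketch, not the code from the linked repository: the `Sampler` interface, the `shouldSample()` method name, the constructor parameters and the `countSamples` helper are all assumptions made for illustration, and the XOR-shift constants (21, 35, 4) are one commonly used 64-bit triple.

```java
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;

public class SamplerSketch {

    /** Hypothetical common interface; the original project may define it differently. */
    interface Sampler {
        boolean shouldSample();
    }

    /** Counter-based: every rate-th call returns true (counter wrap-around is ignored here). */
    static final class AtomicSampler implements Sampler {
        private final int rate;
        private final AtomicInteger counter = new AtomicInteger();

        AtomicSampler(int rate) { this.rate = rate; }

        public boolean shouldSample() {
            return counter.incrementAndGet() % rate == 0;
        }
    }

    /** One Random instance shared by every caller of this sampler. */
    static final class RandomSampler implements Sampler {
        private final int rate;
        private final Random random = new Random();

        RandomSampler(int rate) { this.rate = rate; }

        public boolean shouldSample() {
            return random.nextInt(rate) == 0;
        }
    }

    /** A fresh Random instance is created (and later garbage-collected) on every call. */
    static final class RandomSampler2 implements Sampler {
        private final int rate;

        RandomSampler2(int rate) { this.rate = rate; }

        public boolean shouldSample() {
            return new Random().nextInt(rate) == 0;
        }
    }

    /** XOR-shift generator; the state is deliberately unsynchronized in this sketch. */
    static final class XorShiftSampler implements Sampler {
        private final int rate;
        private long state;

        XorShiftSampler(int rate, long seed) {
            this.rate = rate;
            this.state = (seed == 0) ? 1 : seed; // a zero state would stay zero forever
        }

        public boolean shouldSample() {
            long x = state;
            x ^= (x << 21);
            x ^= (x >>> 35);
            x ^= (x << 4);
            state = x;
            // Drop the sign bit so the remainder is never negative.
            return (x >>> 1) % rate == 0;
        }
    }

    /** Calls the sampler the given number of times and counts how often it says "sample". */
    static int countSamples(Sampler sampler, int attempts) {
        int hits = 0;
        for (int i = 0; i < attempts; i++) {
            if (sampler.shouldSample()) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(countSamples(new AtomicSampler(1000), 10000)); // prints 10
        System.out.println(countSamples(new RandomSampler(1000), 1000000));
        System.out.println(countSamples(new RandomSampler2(1000), 1000000));
        System.out.println(countSamples(new XorShiftSampler(1000, 42L), 1000000));
    }
}
```

Only the counter-based sampler gives an exact 1-in-1000 rate; the other three hit that rate only on average.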
Then 10 runs are performed for each of several scenarios, with 10M sampling attempts per run. The time to complete each run is recorded and the average over the runs is calculated.
The scenarios:
- Single-threaded sampling. All runs are done using one thread and one instance of sampler.
- Multi-threaded sampling (10 threads, 1 sampler per thread).
- Multi-threaded sampling (10 threads sharing 1 sampler instance).
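The shared-instance scenario could be timed along these lines. This is a hedged sketch rather than the harness from the repository: the class and method names are hypothetical, and splitting the 10M attempts evenly across the 10 threads is an assumption, since the text does not say whether the 10M figure is per thread or per run.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

public class SharedSamplerBenchmark {

    /** Hypothetical sampler interface used only by this sketch. */
    interface Sampler {
        boolean shouldSample();
    }

    /** Counter-based sampler: every rate-th call returns true. */
    static final class AtomicSampler implements Sampler {
        private final int rate;
        private final AtomicInteger counter = new AtomicInteger();

        AtomicSampler(int rate) { this.rate = rate; }

        public boolean shouldSample() {
            return counter.incrementAndGet() % rate == 0;
        }
    }

    /**
     * Starts the given number of threads, all calling the same sampler, and
     * returns the elapsed wall-clock time in nanoseconds. Positive sampling
     * decisions are added to hits so the calls cannot be optimized away.
     */
    static long timeSharedRun(final Sampler sampler, int threads,
                              final int attemptsPerThread, final AtomicLong hits)
            throws InterruptedException {
        final CountDownLatch start = new CountDownLatch(1);
        final CountDownLatch done = new CountDownLatch(threads);
        for (int t = 0; t < threads; t++) {
            new Thread(new Runnable() {
                public void run() {
                    try {
                        start.await(); // all threads begin together
                        long local = 0;
                        for (int i = 0; i < attemptsPerThread; i++) {
                            if (sampler.shouldSample()) {
                                local++;
                            }
                        }
                        hits.addAndGet(local);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        done.countDown();
                    }
                }
            }).start();
        }
        long begin = System.nanoTime();
        start.countDown();
        done.await();
        return System.nanoTime() - begin;
    }

    public static void main(String[] args) throws InterruptedException {
        int runs = 10;
        int threads = 10;
        int attemptsPerThread = 1000000; // assumption: 10M attempts in total per run
        long totalNanos = 0;
        for (int r = 0; r < runs; r++) {
            AtomicLong hits = new AtomicLong();
            totalNanos += timeSharedRun(new AtomicSampler(1000), threads, attemptsPerThread, hits);
        }
        System.out.println("average ms per run: " + (totalNanos / runs) / 1000000L);
    }
}
```

For the "1 sampler per thread" scenario, the same loop would create a separate sampler inside each thread instead of capturing a shared one.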
| Sampler | 1 thread | 10 threads, 1 sampler per thread | 10 threads sharing 1 sampler instance |
|---|---|---|---|
| AtomicSampler | 125 | 676 | 7664 |
| RandomSampler | 143 | 588 | 7039 |
| RandomSampler2 | 622 | 6758 | 6519 |
| XorShiftRandomSampler | 211 | 813 | 791 |
Conclusion: in a multithreaded environment, sharing a single Random instance across threads produces worse results (7039) than creating a new Random each time (6519), even though the latter means creating and garbage-collecting Random instances.
Source code is accessible at https://github.com/mykyta-protsenko/samplerPerformance.git
The project was compiled in JDK 1.6 compatibility mode.