Community Page
- blog.dhananjaynene.com Jump to website »
-
Subscribe -
Community
-
Top Commenters
-
Popular Threads
-
Recent Comments
- I hope you people realize how inefficient the code you've written is... After you've divided out all the 2's from the number, you still try to find out if it is divisible by...
- Boss!! and Bossess!!!! Please use perl for everything. Perl is ultimate for any purpose. Python is slow (even with psycho, pypy) and Java is over verbose. (But Java is faster) Check modperl before...
- Yes, and I needed to know. What is the smallest prime factor? I guessed around: 7 (Also, your "copy to clipboard" links don't work for me).
- There are supposed to be tab indentations. Not sure how to keep the tabs in a post
- I just wrote this code (works in python 3.0 with the special "//" for division of integers). This code is much faster and will find your prime factors. I take advantage of the fact that...
/var/log/mind
Dhananjay Nene’s free (as in free speech) opinions on all things related to Software EngineeringPerformance Comparison - C++ / Java / Python / Ruby/ Jython / JRuby / Groovy
Started by Dhananjay Nene · 10 months ago
Update (README CAREFULLY) : I am starting to see hyperlinks to his post with only some of the findings being treated as the link title (eg. X is 100 times faster than Y, X faster than Z). I emphasise once again that I have carefully indicated in the original post that this is but one [
... Continue reading »
12 months ago
#!/usr/bin/perl
use strict;
use warnings;
use diagnostics;
use Benchmark qw(:all);
use constant TOTAL = 40;
use constant INTERVAL = 3;
use constant DEBUG = 0;
sub josephus
{
my %alive = map { $_, 1 } (0..TOTAL-1);
my $current = -1;
while (scalar keys %alive 1)
{
$current += INTERVAL;
$current %= TOTAL;
while (!exists $alive{$current})
{
$current++;
$current %= TOTAL;
}
delete $alive{$current};
print "$current just died.\n"
if DEBUG;
}
my ($survivor) = (keys %alive);
print "last to die: $survivor\n"
if DEBUG;
return $survivor;
}
my $bm = timeit (10000, \josephus);
print timestr ($bm);
On my 7-year-old Linux box, I get 6172 iterations per second with this code.
12 months ago
sub josephus
{
my %alive = map { $_, 1 } (0..TOTAL-1);
my $current = -1;
while (scalar keys %alive 1)
{
for (1..INTERVAL)
{
do
{
$current = ($current + 1) % TOTAL;
} until exists $alive{$current};
}
delete $alive{$current};
print "$current just died.\n"
if DEBUG;
}
my ($survivor) = (keys %alive);
print "last to die: $survivor\n"
if DEBUG;
return $survivor;
}
12 months ago
12 months ago
Java:1.5 microseconds (1500 nanoseconds)
PHP 5.2.6 win32: 1238.56 microseconds
Quercus (Free/Interpreter) = 759,98 microseconds
Quercus (Pro/Compiling) = 229,25 microseconds
12 months ago
12 months ago
I wanted a circular list where one wouldn't need to actually access the list structure too frequently (directly pick off the references and go). So the list had to be circular and extend in both directions.
The code could've certainly be written using a linked list, but I suspect it would be a lot slower.
12 months ago
Interesting. Thanks for introducing me to Quercus - I had no idea it existed.
There was one more interesting observation from your comment (something we all know about but did not seem so much obvious until I read your comment) - your java code benchmark was almost the same as what I got, but PHP seemed half as fast. The difference - you were running it on Windows.
12 months ago
There are different ways to count LOC based on the intent. While what you are suggesting is certainly one way - thats the one often used to measure coding productivity. That was not the intention in mind - I actually did this exercise in the context of comparison of languages and was really looking at brevity and readability (something I did not cover in this post) from that perspective it made sense to count all the lines - blank lines, braces etc.
Having said that there are still some minor errors in the line count (eg having some extra blank lines in the python code) .. but I don't think that is significant
12 months ago
- instead of "for i in (1..size)", "(1..size).each do |i|"
- instead of "if last != nil", just "if last"
- slots are nil by default, so remove @prev/@next = nil in the initializer (also, @first=nil)
- set something if it hasn't been set with the "@first = current unless @first" form (or @first ||= current), which is also a bit faster (but ||= isn't, oddly)
Idiomatic Ruby that makes it shorter, but doesn't help performance:
- instead of attr_reader+attr_writer, just use attr_accessor
- single-statement ifs are often put on one line, like "return shout+1 if shout52), almost 25% on time (107 - 78), and is more idiomatic Ruby to boot.
12 months ago
- single-statement ifs are often put on one line, like "return shout+1 if shout LESSTHAN deadif"
- you don't need to say "return" on the last value in a function, so strike "return current" (you can change "return 1" to just "1", but it doesn't save a line)
That saves over 10 lines (63 to 52), almost 25% on time (107 to 78), and is more idiomatic Ruby to boot.
12 months ago
Thanks for the suggestions. The updated timings are jruby : 80 microseconds, ruby 1.9 : 89 microseconds, ruby 1.8.6 : 380 microseconds (the last one increased)
12 months ago
Hi Dhananjay, Nice problem - but the results show how NOT to write it in Python. I took the problem and messed around a bit in the Python shell looking at the patterns needed to go once through the loop, then successive times through the loop. I used a Python list as the datastructure - their is very seldom need to create linked lists of classes in Python. I boiled it down to a simple function of SIX lines: def findlast(chainlength = 40, kill = 3): ..firstinc, c = 1, range(1,chainlength + 1) ..while len(c)>1: ....#print firstinc, c ....c, firstinc = [x for n,x in enumerate(c) if (firstinc+n) % 3], \\ ......................(n+1 +firstinc) %3 ..#print firstinc, c ..return c (I don\'t knoe if indentation is preserved in your comments so replace leading dots with spaces) The relative timings are a speed up of around four times over your solution: Time per iteration for Chain = 386.570000648 microseconds Time per iteration for findlast = 94.0599989891 microseconds The big change is that I only need 6 lines to do it in. If you un-comment the print statements you might better follow the algorithm. I note that Chain(40).kill(3).count == 27, whereas findlast() == 28 On looking at your code in lines 29 and thirty you should beware a possible off-by-one error as the original question has positions counting from 1 where you initialize Persons count from zero. I like Python as even if my solution were wrong, I think it gives me the best chance of getting the algorithm right. Correct then fast, but only if you know what is fast enough. Thanks again for the interesting problem. - Paddy.
12 months ago
I decided to do a more idiomatic solution in Python than your one. As you yourself state, the Python solution closely follows the static language solutions and so it is handicapped.
I came up with this:
..def findlast(chainlength = 40, kill = 3):
....firstinc, c = 1, range(1,chainlength + 1)
....while len(c)1:
........#print firstinc, c
........c, firstinc = [x for n,x in enumerate(c) if (firstinc+n) % 3], \
......................(n+1 +firstinc) %3
....#print firstinc, c
....return c
(Replace initial dots with spaces)
The run time was around four times faster, and its is only SIX lines of code versus the ~33 lines of your static-like classes.
Here is my timings:
Time per iteration for Chain = 386.570000648 microseconds
Time per iteration for findlast = 94.0599989891 microseconds
If you remove the # from the print statements you get printouts to help show what is happeneing.
To create this I used Pythons interactive shell to play around with removing positions in a list, then on how to wrap around and remove more positions from the start of the reduced list. After the experimentation it was straight forward to create the function, (not everything needs to be a class).
I note that your result of Chain(40).kill(3).count == 27 wheras I get findlast() == 28
On looking closer at your lines 18 and 19 I see that you are counting Persons from zero when the problem statement says count from 1. You need to make sure that you don't get an off-by-one error when you report the position as someones life could be at stake :-)
All-in-all I am quite happy with finding a solution with Python, I can concentrate on getting the algorithm right. If I need it faster then I would probably translate the Python algorithm or use the psycho JIT compiler.
- Paddy.
12 months ago
Thanks for the interesting solution. I am taking a look at that.
Your remark on the zero based vs. one based index is interesting - the problem statement says you count from one (which is what shout does), but the primary key for each person in my case is indeed zero based (they could be 'a','b','c' instead of zero or one :) ). Even if I have been coding in python for the last couple of months, its kind of difficult to take the 'C' programmer out of me :) .
12 months ago
Here is the code I would write for this. It works equally well in Java, C or C++.
public class Chain
{
protected static final String versionID = "@(#) $ Id: $ ";
private int size;
public Chain(int size)
{
this.size = size;
}
public int kill(int nth)
{
int[] people = new int[size];
for(int i=0;i 1)
{
curCount++;
if (curCount % nth != 0)
{
people[fillPos] = people[curPos];
fillPos++;
}
curPos++;
if (curPos == realSize)
{
realSize -= (curPos - fillPos);
curPos = 0;
fillPos = 0;
}
}
return people[0];
}
public static void main(String[] args)
{
int ITER = 100000;
runTest(ITER);
runTest(ITER);
runTest(ITER);
runTest(ITER);
}
private static void runTest(int ITER)
{
long start = System.nanoTime();
for (int i = 0 ; i ITER ; i++)
{
Chain chain = new Chain(40);
chain.kill(3);
}
long end = System.nanoTime();
System.out.println("Time per iteration = " + ((end - start) / (ITER )) + " nanoseconds.");
}
}
12 months ago
With a simple addition of two lines (OK and the prior installation of the non-standard package called psyco)
The Psyco JIT compiler gave the following faster times on the same box:
Time per iteration for findlast = 32.1899986267 microseconds
Time per iteration for Chain = 109.370000362 microseconds
So, that is an order of magnitude speed up.
- Paddy..
12 months ago
public class Chain
{
private int size;
public Chain(int size)
{
this.size = size;
}
public int kill(int nth)
{
int[] people = new int[size];
for(int i=0;i 1)
{
curCount++;
if (curCount != nth)
{
people[fillPos] = people[curPos];
fillPos++;
}
else curCount = 0;
curPos++;
if (curPos == realSize)
{
realSize -= (curPos - fillPos);
curPos = 0;
fillPos = 0;
}
}
return people[0];
}
public static void main(String[] args)
{
int ITER = 100000;
runTest(ITER);
runTest(ITER);
runTest(ITER);
runTest(ITER);
}
private static void runTest(int ITER)
{
long start = System.nanoTime();
for (int i = 0 ; i ITER ; i++)
{
Chain chain = new Chain(40);
chain.kill(3);
}
long end = System.nanoTime();
System.out.println("Time per iteration = " + ((end - start) / (ITER )) + " nanoseconds.");
}
}
12 months ago
python - version 2.5.2:
Time per iteration = 137.559359074 microseconds
ruby - 1.8.6 (2007-09-24 patchlevel 111) [x86_64-linux]
Time per iteration = 321.06325 microseconds
PHP - 5.2.4-2ubuntu5.1 with Suhosin-Patch 0.9.6.2 (cli):
Time per iteration = 322.66 microseconds
Obviously the numbers will be different on different setups, but it's interesting that for me, results with ruby and python were similar to yours, but the PHP one was much faster, and comparable to the Ruby result.
I'm running Ubuntu Hardy (desktop), 2GB RAM, Intel Core 2 Quad Q6600.
12 months ago
From wikipedia : Cēterīs paribus is a Latin phrase, literally translated as "with other things the same." It is commonly rendered in English as "all other things being equal."
The exercise is an exercise simultaneously in benchmarking and comparison. It is not an exercise in absolute performance improvement.
This exercise in comparison would be pointless if the algorithms themselves are variable. Often the algorithm might in many cases may make a far bigger impact than the language. The results in case of such an exercise would tell you even lesser about comparative performance across languages than this exercise. One way to conduct this exercise is to take up the algorithm you've suggested and implement it across all languages.
BTW If performance alone was critical, I would actually never even attempt to benchmark after C++ and Java - it would've been an obviously worthless exercise. The important message one gets out of this kind of exercise is to get a "sense" of the comparative performance so that one can exercise more informed judgement, and I really fail to see how a highly sophisticated algorithm is likely to give you any better sense of comparative performance than this exercise does (though I am certain the results themselves will NOT be identical - which is why one should always use many microbenchmarks not just one).
On a separate note, BAD algo is a very subjective point of view. In fact I would argue that in most real life stuations you would not always be able to use a int array as you have suggested, because the person class may have many more attributes than its id alone. If you realise, the algo I have used is not only more readable and maintainable, but is lesser likely to get thrown out and replaced with a different one as the problem domain gets more complex.
Is this the only way to do a language performance comparison - Certainly not. But is this any more or less flawed way than doing a comparison based on a different algorithm - I dont think so.
12 months ago
It would definetly be interesing to see those numbers.
Nice work so far.
12 months ago
1) The performance tests do not take into account
"Software Scalable Solutions". In short; assume you need
to perform these operations for 100's of millions of iterations
in a Super Static State maintaing a Global Context; and Hot Deployment. How would you even be able to do this w/C++, Corba?
2) The Java Code is not optimized; To start :
--------Latency----------
Chain chain = new Chain(40);
--------should be----------
Chain chain =null;
for (int i = 0 ; i ITER ; i++)
{
chain = new Chain(40);
--------and----------
This
{
if (first == null) first = current;
if (last != null)
{
last.setNext(current);
current.setPrev(last);
}
}
Is can be written w/far less cycles; (eg:continue;else,etc..._)
12 months ago
12 months ago
also a thing that kills dynamic languages is attribute lookup,this is do that most of dynamic language implement objects using dictionaries so using static functions with int array is the fastest way for dynamic languages.
about mod operator being slow,if mod is a power of 2 then x % 2^n = x (2^n-1) so 5 % 4 is 5 3.
12 months ago
// compile using: gdmd -O -inline -release josephus.d
module josephus;
import std.stdio;
import std.date;
final class Chain {
private Person first_;
this(int size) {
Person last;
for (int i = 0; i size; i++) {
Person current = new Person(i);
if (first is null) {
first = current;
}
if (last !is null) {
last.next = current;
current.prev = last;
}
last = current;
}
first.prev = last;
last.next = first;
}
Person kill(int nth) {
Person current = first;
int shout = 1;
while (current.next !is current) {
shout = current.shout(shout, nth);
current = current.next;
}
first = current;
return current;
}
Person first() {
return first_;
}
void first(Person first__) {
first_ = first__;
}
}
final class Person {
int count_;
private Person prev_;
private Person next_;
this(int count__) {
count_ = count__;
}
int shout(int shout, int deadif) {
if (shout deadif) {
return shout + 1;
}
prev.next = next;
next.prev = prev;
return 1;
}
int count() { return count_; }
Person prev() { return prev_; }
void prev(Person prev__) { prev_ = prev__; }
Person next() { return next_; }
void next(Person next__) { next_ = next__; }
}
void main(char[][] args) {
int ITER = 100000;
auto start = getUTCtime();
for (int i = 0; i ITER; i++) {
auto chain = new Chain(40);
chain.kill(3);
}
auto end = getUTCtime();
writefln("Time per iteration = %.3f ms", 1.0e3f*(end - start)/ITER / TicksPerSecond);
}
12 months ago
12 months ago
As I work mainly with Flash and PHP, I was very curious to see how ActionScript3 behaves. Good news, it's pretty fast.
Can't wait for an open source server that uses AS3. Maybe an apache module?
here are my results:
AS3 (Activex player 9,0,124,0, nondebug): 50 microseconds
AS3 (Stand-alone player, 9.0.115.0, debug): 150 microseconds
PHP (apache module): 600 microseconds
(on the same machine - Core2 Duo 2.4GHz, php running into vmware/debian. If I run the php version in windows, it gets worse :D)
the AS3 version of the program is on my blog http://www.victordramba.com/?p=16
12 months ago
I made a slightly different version using izip() and count(). It is no fasterbut might be even easier to show how the python list is used as a circular list.
It also has the two lines needed for optimising with psyco.
..from itertools import izip, count
..import psyco
..psyco.full()
..def findlast2(chainlength = 40, kill = 3):
......n, c = 0, range(1,chainlength + 1)
......while len(c)1:
..........#print "n=%3i, (n+1)%%3=%i, c=%s" % (n, (n+1)%3, c)
..........c = [x for n,x in izip(count(n+1),c) if n % 3]
......#print "n=%3i, (n+1)%%3=%i, c=%s" % (n, (n+1)%3, c)
......return c
Again, if you un-comment the print statements and call findlast2() you get a better insight into the algorithm as it will return:
findlast2()
n= 0, (n+1)%3=1, c=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40]
n= 40, (n+1)%3=2, c=[1, 2, 4, 5, 7, 8, 10, 11, 13, 14, 16, 17, 19, 20, 22, 23, 25, 26, 28, 29, 31, 32, 34, 35, 37, 38, 40]
n= 67, (n+1)%3=2, c=[1, 4, 5, 8, 10, 13, 14, 17, 19, 22, 23, 26, 28, 31, 32, 35, 37, 40]
n= 85, (n+1)%3=2, c=[1, 5, 8, 13, 14, 19, 22, 26, 28, 32, 35, 40]
n= 97, (n+1)%3=2, c=[1, 8, 13, 19, 22, 28, 32, 40]
n=105, (n+1)%3=1, c=[1, 13, 19, 28, 32]
n=110, (n+1)%3=0, c=[1, 13, 28, 32]
n=114, (n+1)%3=1, c=[13, 28]
n=116, (n+1)%3=0, c=[13, 28]
n=118, (n+1)%3=2, c=[28]
[28]
12 months ago
You concentrate on the speed of each implementation but speed is no good if it gives wrong results.
How do you know you are getting the right result?
I've walked through my algorithms partial results as you can see and know that I have to return 28. It was straightforward to do in Pythons shell.
On researching the problem further I found this page: http://mathworld.wolfram.com/JosephusProblem.html. Mathworld seems to be good at maths problems and I stuck their 41,3 sized problem into my algorithm with the print statements un-commented and found that my algo gave the same results of the last person being numbered 31 and the second to last being 16.
And on that note I then tried to replicate the triangular table of results numbered (3) on the Wolfram page and came a cropper. I have to withdraw findlast() and make the following modification to findlast2() to handle the special case of choosing every soldier on order:
..def findlast2(chainlength = 40, kill = 3):
......if kill == 1:
..........return [chainlength]
......n, c = 0, range(1,chainlength + 1)
......while len(c)1:
..........#print "n=%3i, (n+1)%%3=%i, c=%s" % (n, (n+1)%3, c)
..........c = [x for n,x in izip(count(n+1),c) if n % kill]
......#print "n=%3i, (n+1)%%3=%i, c=%s" % (n, (n+1)%3, c)
......return c
With the following loops to create the table:
..mx = 10
..print"\nTable showing last positions"
..print "\n%12s" % "kill every:",
..for kill in range(1,mx+1):
......print "%2i" % kill,
..print
..for chain in range(1,mx+1):
......print "\n%12s" % ("chain of %i:" % chain),
......for kill in range(1,chain+1):
..........print "%2i" % findlast2(chain, kill)[0],
..print
..
I then get the following output that tallies with the Wolfram table:
..Table showing last positions
..
.. kill every: 1 2 3 4 5 6 7 8 9 10
..
.. chain of 1: 1
.. chain of 2: 2 1
.. chain of 3: 3 3 2
.. chain of 4: 4 1 1 2
.. chain of 5: 5 3 4 1 2
.. chain of 6: 6 5 1 5 1 4
.. chain of 7: 7 7 4 2 6 3 5
.. chain of 8: 8 1 7 6 3 1 4 4
.. chain of 9: 9 3 1 1 8 7 2 3 8
..chain of 10: 10 5 4 5 3 3 9 1 7 8
A quick modification of the table printing loop to use:
print "%2i" % int(findlast2(chain, kill)[0] == Chain(chain).kill(kill).count+1)
Shows that by allowing for the off by one error, Your Python solution tallies with mine.
What to glean from the above?
* Verification counts.
* Verification and exploration is easy in an interpreted language that gives concise algorithms - it is easier to "throw one away" in the quest for quality.
- Paddy.
12 months ago
public class Chain {
private List people = new ArrayList();
public Chain(int size) {
for(int i=1;i1)
for (int i = 0; i people.size(); i+=nth)
people.remove(i);
return people.get(0);
}
public static void main(String[] args) {
int ITER = 100000;
long start = System.nanoTime();
for (int i = 0; i ITER; i++) {
Chain chain = new Chain(40);
chain.kill(3);
}
long end = System.nanoTime();
System.out.println("Time per iteration = " + ((end - start) / (ITER)) + " nanoseconds.");
}
}
12 months ago
12 months ago
100000; ITERATION and C++ with
1000000;
12 months ago
12 months ago
12 months ago
There is more info on the problem here:
http://www.research.att.com/~njas/sequences/A03...
Including a recurrence relation that gives the answer.
It translates into the following Python recursive function:
def T(n, k):
....if n == 1 : return 1
....tmp = (T(n-1, k)+k) % n
....if tmp == 0:
........return n
....else:
........return tmp
....
for n in range(1,8):
....print "\n",n,":",
....for k in range(1, n+1): print T(n, k),
1 : 1
2 : 2 1
3 : 3 3 2
4 : 4 1 1 2
5 : 5 3 4 1 2
6 : 6 5 1 5 1 4
7 : 7 7 4 2 6 3 5
T(40,3)
28
If only I new enough maths to derive the above!
Oh, here's the timings:
Time per iteration for T = 8.75 microseconds
Time per iteration for findlast2 = 93.1200003624 microseconds
Time per iteration for Chain = 221.560001373 microseconds
I haven't had so much fun with a sequence since Ackermanns function, and that was ages ago :-)
- Paddy.
12 months ago
The code in case anyone cares to try it in a non-virtualized instance of WinXP and/or Vista:
using System;
using System.Diagnostics;
public class Chain
{
private Person first = null;
public Chain(int size)
{
Person last = null;
Person current = null;
for (int i = 0 ; i size ; i++)
{
current = new Person(i);
if (first == null) first = current;
if (last != null)
{
last.setNext(current);
current.setPrev(last);
}
last = current;
}
first.setPrev(last);
last.setNext(first);
}
public Person kill(int nth)
{
Person current = first;
int shout = 1;
while(current.getNext() != current)
{
shout = current.shout(shout, nth);
current = current.getNext();
}
first = current;
return current;
}
public Person getFirst()
{
return first;
}
public static void Main(String[] args)
{
int ITER = 100000;
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
for (int i = 0 ; i ITER ; i++)
{
Chain chain = new Chain(40);
chain.kill(3);
}
stopwatch.Stop();
long totalMicroseconds = stopwatch.ElapsedMilliseconds * 10000;
Console.WriteLine("Elapsed Time: {0}ms, Time per iteration: {1}ms", totalMicroseconds, (totalMicroseconds / ITER));
}
}
public class Person
{
int count;
private Person prev = null;
private Person next = null;
public Person(int count)
{
this.count = count;
}
public int shout(int shout, int deadif)
{
if (shout deadif) return (shout + 1);
this.getPrev().setNext(this.getNext());
this.getNext().setPrev(this.getPrev());
return 1;
}
public int getCount()
{
return this.count;
}
public Person getPrev()
{
return prev;
}
public void setPrev(Person prev)
{
this.prev = prev;
}
public Person getNext()
{
return next;
}
public void setNext(Person next)
{
this.next = next;
}
}
12 months ago
PS Z:\Nuxleus\Public\Source\CodeSamples\Benchmark\Benchmark\bin\Release .\Benchmark.exe
Elapsed Time: 1070000ms, Time per iteration: 10ms
Elapsed Time: 1080000ms, Time per iteration: 10ms
Elapsed Time: 1050000ms, Time per iteration: 10ms
Elapsed Time: 1080000ms, Time per iteration: 10ms
Elapsed Time: 1050000ms, Time per iteration: 10ms
The updated code:
using System;
using System.Diagnostics;
public class Chain
{
private Person first = null;
public Chain(int size)
{
Person last = null;
Person current = null;
for (int i = 0 ; i size ; i++)
{
current = new Person(i);
if (first == null) first = current;
if (last != null)
{
last.setNext(current);
current.setPrev(last);
}
last = current;
}
first.setPrev(last);
last.setNext(first);
}
public Person kill(int nth)
{
Person current = first;
int shout = 1;
while(current.getNext() != current)
{
shout = current.shout(shout, nth);
current = current.getNext();
}
first = current;
return current;
}
public Person getFirst()
{
return first;
}
public static void Main(String[] args)
{
int ITER = 100000;
RunTest(ITER);
RunTest(ITER);
RunTest(ITER);
RunTest(ITER);
RunTest(ITER);
}
public static void RunTest(int ITER) {
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
for (int i = 0; i ITER; i++) {
Chain chain = new Chain(40);
chain.kill(3);
}
stopwatch.Stop();
long totalMicroseconds = stopwatch.ElapsedMilliseconds * 10000;
Console.WriteLine("Elapsed Time: {0}ms, Time per iteration: {1}ms", totalMicroseconds, (totalMicroseconds / ITER));
}
}
public class Person
{
int count;
private Person prev = null;
private Person next = null;
public Person(int count)
{
this.count = count;
}
public int shout(int shout, int deadif)
{
if (shout deadif) return (shout + 1);
this.getPrev().setNext(this.getNext());
this.getNext().setPrev(this.getPrev());
return 1;
}
public int getCount()
{
return this.count;
}
public Person getPrev()
{
return prev;
}
public void setPrev(Person prev)
{
this.prev = prev;
}
public Person getNext()
{
return next;
}
public void setNext(Person next)
{
this.next = next;
}
}
11 months ago
This is sri. Thanks for signing up at enewss.
Regarding your performance comparison between different languages...
Most of the compilers do a very good job at optimizing the loops and hence not good for performance metrics. I would prefer to solve an algorithm for searching for the sake of comparing different languages.
Again, to be frank, we are comparing apples to oranges because the languages you chose are invesnted for specific purpose and hence shouldn't be compared..
Your conclusion that performance of c++ language worsens with freeing up of memory is correct because of reasons you already know. In case of java, jvm handles all of the memory related services freeing up process to do its job, whereas in c++, the user process has to take care of it.
However, there is one subtle difference between c++ and java, if programmed correctly, c++ can beat the shit out of java when it comes to memory allocation.
in c++ objects can be created on stack, whereas in java everything is on the heap. So, in c++, objects can be created and destroyed faster if you make use of stack properly without the need of new() routine/heap_memory.
Again, if programmed perfectly to the finest bit and byte level, c++ can perform very well in real time applications compared to java, as java allocates storage for program variables and structures with highest common denominator among all the platforms available. In c++, you have the control on storage size.
Performance measurement is an art, industry always has fine tuned their applications only to come up with a biased report to almost always give high marks to their products/applications.
Neverthless, your blog made me recollect things which i almost forbot about.
Thanks
Sri
11 months ago
There is no reason why you should write java methods like
public int getCount()
{
return this.count;
}
And C methods like
int count() { return _count; }
This gives you a wrong line count.
As you say, you should compare apples to apples, so you should use the same rule for code formatting on all the languages.
BTW, Java Code Conventions advise us to write our Java methods with { at the end of the line, like:
public int getCount() {
and not in the beginning of the next line like you did.
Please, reformat the codes and recount the lines.
11 months ago
11 months ago
One reason is because games have LOTS of math and Java is still very slow at intensive math calculations in comparison to C++.
This benchmark isn't worth much because it doesn't do anything useful.
11 months ago
Instead of
int ITER = 1000000;
use
int ITER = 100000;
as other codes.
11 months ago
11 months ago
11 months ago
@and - I don't know COBOL (actually the last time I coded with it was in 1990).
@Mauricio - I believe more and more games are starting to be written using Java. Games often require not just a fast gaming engine (which I believe Java has started contributing to) but also highly optimised Graphics APIs (last I checked there was no DirectX / OpenGL using Java directly. :) ). However I am sure C will in all likelihood continue to have an advantage over Java in games.
@Vitor - I have already remarked above (in the comments) - the number of iterations do not make any significant difference to the results (though the fact that these are different is accidental and not intentional)
11 months ago
I would offer a rather different view than what you stated.
If all compilers were able to optimise all loops equally I would agree with you. However the fact of the matter is that they dont. Hence such a comparison is not entirely unwarranted.
If users were attempting to use languages for the reasons they were invented and not for others I would go along with you. However Java was invented for set top boxes. But it comes up time and again as an option in networking, games, transactional processing, databases etc. etc. I think it is fair to compare them.
In both c++ and java the user process has to handle the memory allocation and deallocation. The difference is that in Java *sometimes* the automatic garbage collector because it works on deallocation in a bulk mode, has an opportunity to optimise memory management far beyond what any finely crafted developer specified memory allocation can do. Hence *sometimes* java memory management can be faster than (I don't know if it can beat the shit out of) hand crafted C++ code.
C will allow you fine control on storage sizes down to the byte level. There are scenarios where this is really useful (eg. embedded) and there are scenarios where this may not be. It is precisely because of this, an architect requires to achieve the appropriate tradeoff between control / speed / ram (C++) vs. high developer producitivity and agility (Python / Ruby) with Java somewhere inbetween.
11 months ago
Also, PHP has an extra line between 8 and 9, so it would be only 84 lines. Also, you don't need to close the script with ? if the file has only PHP code, which would lead us to 83 lines. More to say: you don't nees {} for 1 line ifs, which would reduce 2 more lines, leaving us with 81 lines.
And I could keep fixing your code for days, but I gotta better things to do than see your PHP 4 code (yes, you did not used PHP 5 code! meaning the PHP engine also slows down to try toi execute the code as PHP 4!).
11 months ago
Your php code does not look good but that's fine.
I just want to corrent Pablo Sánchez saying that PHP code is not PHP5 code. You didn't have __construct() and __destruct() in PHP4, they were introduced in PHP5.
Oh, and PHP CLI version is slower AFAIK. Apache module is faster (but that way you'd be testing apache also, I see the problem).
11 months ago
CPython is able to access instance variables more quickly if they're explicitly declared, as:
class Person(object):
__slots__ = ('count', 'prev', 'next')
class Chain(object):
__slots__ = ('first',)
That sped things up from 159 to 135 ms/iteration. You can shave a bit more time off (=128 ms) by using pointer comparison, i.e., "is not" and "is" instead of "!=" and "==". Running under Psyco with these improvements, it finishes in 54 ms.
Jython needs a different tweak at the moment. By switching from new-style to old-style classes:
class Person:
class Chain:
the runtime went from 476 to 187 ms/iteration.
I'm a Jython developer and needless to say the slowness of new-style classes is on our short list for performance improvements.
11 months ago
I fear for the new comers around here that come to appreciate his misguided judgement of things.
Anyway, in the sake of not wasting your comments space:
@Riley How does Python deals with continuous instance creation inside loops? As others have noticed, this adds some dimension to the overall benchmark, with the memory management (or (de)allocation scheme) and associated optimizations playing important roles in the result. I think this might obfuscate what can be said about dispatch, lookup caches, inlining and so forth.
11 months ago
11 months ago
The C++ benchmark, on the other hand goes through the more expensive C++ heap allocation, but also deallocates that memory precisely where the memory use is finished.
This isn't an excuse for C++, just an explanation of what I believe to be the important reason for the difference between the two. Still, the Java result is nice.
11 months ago
I had precisely the same hypothesis that you are presenting. Even if you remove the deallocation calls from C++ completely, interestingly the java code is still much faster (but the gap between the two is a lot smaller)
11 months ago
11 months ago
11 months ago
11 months ago
10 months ago
import time
from collections import deque
ITER = 100000
START_LEN = 40
KILL_EVERY = 3
start = time.time()
initial_chain = range(1, 1+START_LEN)
for i in xrange(ITER):
chain = deque(initial_chain)
while len(chain) > 1:
chain.rotate(-(KILL_EVERY-1))
chain.popleft()
end = time.time()
print 'Time per iteration = %s microseconds ' % ((end - start) / float(ITER) * 1e6)
10 months ago
It could be cast in an object form with the same external interface as the original. In some sense that makes it the same object implementation, since it shouldn't matter whether we implement the linked list pointers in Python or use the ones in the compiled deque code.
from collections import deque
class Person(object):
..__slots__ = 'count'
..def __init__(self, count):
....self.count = count
..def shout(self, cnt, nth): return cnt # dummy method
class Chain(deque):
..def __init__(self, size):
....alist = [Person(i) for i in xrange(size)]
....deque.__init__(self, alist)
..def kill(self, nth):
....n1 = -(nth-1)
....while len(self):
......self.rotate(n1)
......last=self.popleft()
....return last
10 months ago
On my machine, making a loop of 10000000 iterations:
java version: 26922 milliseconds
my c++ version: 21422 microseconds
see details on whttp://www.bigno.it/speed/cppspeed.html
10 months ago
www.bigno.it/speed/cppspeed.html
sorry for the mistake
10 months ago
my c++ version: 21422 milliseconds
sorry, previously I wrote microseconds intead of milliseconds