<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"><channel><title>/var/log/mind - Latest Comments in Turbocharge your string keyed hashmaps</title><link>http://var-log-mind.disqus.com/</link><description>Dhananjay Nene’s free (as in free speech) opinions on all things related to Software Engineering</description><language>en</language><lastBuildDate>Mon, 28 Apr 2008 07:26:17 -0000</lastBuildDate><item><title>Re: Turbocharge your string keyed hashmaps</title><link>http://blog.dhananjaynene.com/2008/04/turbocharge-your-string-keyed-hashmaps/#comment-1209627</link><description>I have many doubts about your optimization:&lt;br&gt;1. You should NEVER use new String(String). It forces the JVM to create a new String object.&lt;br&gt;2. String literals, such as "mykey", are already automatically interned by Java.&lt;br&gt;3. The call to intern is slow, since it performs a lookup in a hashtable containing ALL String literals.&lt;br&gt;4. Hashcode is already stored in String, and calculated the first time it is needed (see source code of String.java)</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">David Shay</dc:creator><pubDate>Mon, 28 Apr 2008 07:26:17 -0000</pubDate></item><item><title>Re: Turbocharge your string keyed hashmaps</title><link>http://blog.dhananjaynene.com/2008/04/turbocharge-your-string-keyed-hashmaps/#comment-1209626</link><description>If performance is the main issue and you are creating a Symbol class anyway, consider putting the associated values directly in Symbol's fields. 
That will be as fast as it gets.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Dimitris Andreou</dc:creator><pubDate>Mon, 21 Apr 2008 04:49:50 -0000</pubDate></item><item><title>Re: Turbocharge your string keyed hashmaps</title><link>http://blog.dhananjaynene.com/2008/04/turbocharge-your-string-keyed-hashmaps/#comment-1209623</link><description>Calling intern can lead to PermGen memory exhaustion; see this XStream issue: &lt;a href="http://jira.codehaus.org/browse/XSTR-395" rel="nofollow"&gt;http://jira.codehaus.org/browse/XSTR-395&lt;/a&gt;. The fix involves using weak references to get around the cache size issue.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Khalil</dc:creator><pubDate>Thu, 17 Apr 2008 17:15:00 -0000</pubDate></item><item><title>Re: Turbocharge your string keyed hashmaps</title><link>http://blog.dhananjaynene.com/2008/04/turbocharge-your-string-keyed-hashmaps/#comment-1209622</link><description>@khalil Addressed your concerns in an updated version.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Dhananjay Nene</dc:creator><pubDate>Thu, 17 Apr 2008 16:34:32 -0000</pubDate></item><item><title>Re: Turbocharge your string keyed hashmaps</title><link>http://blog.dhananjaynene.com/2008/04/turbocharge-your-string-keyed-hashmaps/#comment-1209621</link><description>Well, to begin with, String being a mostly immutable object, computing the hash code once seems natural, and hashCode does cache the calculated value. The reason the fast code is indeed faster is that the compiler interns String literals &lt;a href="http://java.sun.com/j2se/1.4.2/docs/api/java/lang/String.html#intern%28" rel="nofollow"&gt;http://java.sun.com/j2se/1.4.2/docs/api/java/la...&lt;/a&gt;) &lt;br&gt;As to the static HashMap in the Symbol class, I am afraid it is a memory leak: it is a cache with no eviction strategy and one that cannot be flushed. As a bonus, getSymbol is not thread-safe! 
Static is indeed evil &lt;a href="http://gbracha.blogspot.com/2008/02/cutting-out-static.html" rel="nofollow"&gt;http://gbracha.blogspot.com/2008/02/cutting-out...&lt;/a&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Khalil</dc:creator><pubDate>Thu, 17 Apr 2008 15:54:26 -0000</pubDate></item><item><title>Re: Turbocharge your string keyed hashmaps</title><link>http://blog.dhananjaynene.com/2008/04/turbocharge-your-string-keyed-hashmaps/#comment-1209620</link><description>@Dave. To correct myself, I think if String.intern() is called each time a string key is used, one should get similar results without needing a Symbol class.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Dhananjay Nene</dc:creator><pubDate>Thu, 17 Apr 2008 15:26:00 -0000</pubDate></item><item><title>Re: Turbocharge your string keyed hashmaps</title><link>http://blog.dhananjaynene.com/2008/04/turbocharge-your-string-keyed-hashmaps/#comment-1209619</link><description>@Jacob. I agree that the test does no JVM warm-up and does not account for the Symbol map lookup at Symbol construction time. Here's the rationale. I am not so concerned with the timings for the 'usual' case (I should have called it 'identical'). If a programmer can ensure that the strings are identical, either explicitly or by always keeping a cached copy of the String.intern() result as suggested by Dave, then the Symbol class isn't really required. The 1% keeps varying between slightly negative and slightly positive across different runs (basically, the additional cost of the lookup and the cached value of the hash code are being traded off against each other). Yes, the test does not reflect the cost of creating a symbol. But in most programs, the number of times a symbol is created can be made much lower than the number of times it is used for lookups. 
What you suggest as a real-world example is not necessarily a good one, since it is quite feasible to ensure that the symbols get created far less often than the getter is called.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Dhananjay Nene</dc:creator><pubDate>Thu, 17 Apr 2008 15:22:07 -0000</pubDate></item><item><title>Re: Turbocharge your string keyed hashmaps</title><link>http://blog.dhananjaynene.com/2008/04/turbocharge-your-string-keyed-hashmaps/#comment-1209618</link><description>Looking at your tests, there are a couple of issues: 1) You run the 'usual' test first without a JVM warm-up, so that 1% probably shouldn't even be there; I'd even say you'd hit a negative ratio there. 2) You aren't accounting for the Symbol map lookup in your tests and are using a hard reference to the symbols. A real-world example of testSymbol would be to walk through each String key, do a Symbol.get(...) and then do the hashmap lookup.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Jacob Hookom</dc:creator><pubDate>Thu, 17 Apr 2008 15:11:11 -0000</pubDate></item><item><title>Re: Turbocharge your string keyed hashmaps</title><link>http://blog.dhananjaynene.com/2008/04/turbocharge-your-string-keyed-hashmaps/#comment-1209617</link><description>@Dave. I have updated the post with the sources so that my findings can be verified. I missed the String.intern() part. I think it will certainly make for a simpler Symbol implementation. 
However, the Symbol implementation will still be required.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Dhananjay Nene</dc:creator><pubDate>Thu, 17 Apr 2008 15:04:01 -0000</pubDate></item><item><title>Re: Turbocharge your string keyed hashmaps</title><link>http://blog.dhananjaynene.com/2008/04/turbocharge-your-string-keyed-hashmaps/#comment-1209616</link><description>Doesn't String.intern() effectively do what you want?&lt;br&gt;&lt;br&gt;&lt;br&gt;Of course, this still seems like a pointless optimization. In my test of doing a million puts/gets, the difference between your two approaches is about 100 ms in total, or .104 microseconds per operation. Not sure that would make a difference in most application contexts.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Dave</dc:creator><pubDate>Thu, 17 Apr 2008 14:53:37 -0000</pubDate></item><item><title>Re: Turbocharge your string keyed hashmaps</title><link>http://blog.dhananjaynene.com/2008/04/turbocharge-your-string-keyed-hashmaps/#comment-1209615</link><description>@Jacob The *another* hash lookup you mention will check for identity and then equality of keys. The Symbol class is constructed such that the underlying strings are always identical (even if the symbol is constructed twice using two non-identical but equal strings). This is what makes the big difference.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Dhananjay Nene</dc:creator><pubDate>Thu, 17 Apr 2008 14:50:29 -0000</pubDate></item><item><title>Re: Turbocharge your string keyed hashmaps</title><link>http://blog.dhananjaynene.com/2008/04/turbocharge-your-string-keyed-hashmaps/#comment-1209614</link><description>How could that be faster when you are just deferring the regular String lookup to the Symbol map and then only slightly optimizing *another* hash lookup? 
If you make a case to keep a hard reference to the Symbol object outside of the map, then you might as well do that for the String itself and get the same optimization.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Jacob Hookom</dc:creator><pubDate>Thu, 17 Apr 2008 14:37:16 -0000</pubDate></item></channel></rss>