Reviews by Brendan Braybrook


Tie-Judy (0.05) *****

with the release of 0.05, the keys() method has almost doubled in speed.

some timing results:
[6.227614] generated 1000000 mostly unique keys
[3.29436] JUDY: built array with ~1000000 keys
[2.104319] HASH: built hash with ~1000000 keys

oddly, unlike Josh's findings, i find the native hash to be faster during array creation (2.10 vs 3.29 above).

[1.428312] JUDY: read 999406 sorted keys
[1.55473] HASH: read 999406 unsorted keys
[5.683097] HASH: read 999406 sorted keys

key retrieval is just barely faster with judy (1.43 vs 1.55), but if you need the keys sorted judy is a clear win (1.43 vs 5.68), since it already stores them internally in sorted order.

[0.085982] JUDY: read 980 sorted keys from mda to mdc
[1.327951] HASH: read 981 unsorted keys from mda to mdc
[0.00025] JUDY: read 100 sorted keys from mda to mdc limit 100
[0.907928] HASH: read 100 unsorted keys from mda to mdc limit 100
[0.085711] JUDY: read 194 sorted keys from mda to mdc match /4/
[1.359322] HASH: read 194 unsorted keys from mda to mdc match /4/

the new search method in 0.05 really lets judy arrays shine where sorted results are required. if the sort order of your data is important, you can retrieve subsets of the keys/values in a fraction of the time it would take with a hash: with a hash you'd have to scan every key, then sort once you'd collected your results.
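
to make the difference concrete, here is a rough sketch of the two access patterns. it is not Tie::Judy's API: it's written in python, a plain sorted list plus the standard bisect module stands in for any structure that keeps its keys ordered, and the function names, the sample keys and the inclusive "from/to" bounds are all assumptions for illustration.

```python
import bisect

def hash_range(table, lo, hi, limit=None):
    """Hash approach: scan every key, keep those in [lo, hi], then sort."""
    hits = sorted(k for k in table if lo <= k <= hi)
    return hits if limit is None else hits[:limit]

def sorted_range(sorted_keys, lo, hi, limit=None):
    """Ordered-store approach: binary-search both bounds, then slice."""
    start = bisect.bisect_left(sorted_keys, lo)
    stop = bisect.bisect_right(sorted_keys, hi)
    if limit is not None:
        # With a limit we never walk past the first `limit` matches.
        stop = min(stop, start + limit)
    return sorted_keys[start:stop]

# Hypothetical sample data, not the keys used in the timings above.
table = {"mcz": 1, "mda": 2, "mdb4": 3, "mdc": 4, "mdd": 5, "abc": 6}
sorted_keys = sorted(table)  # stand-in for a store that is always ordered

print(hash_range(table, "mda", "mdc"))
print(sorted_range(sorted_keys, "mda", "mdc", limit=2))
```

the hash version touches every key no matter how small the range is, while the ordered version only touches the slice it returns, which is roughly why a limit helps the ordered store so much more.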

[1.161717] JUDY: read 191777 sorted keys where value matches /4/
[1.691851] HASH: read 191777 unsorted keys where value matches /4/

even the value_re mechanism inside the search method is useful: it takes less time to have search return only the keys whose values match than it does to return all the keys and then do the matching yourself.

synopsis: excellent in applications with large datasets where the data will persist for a while and must be kept/returned in sorted alphanumeric order.

spitting out pagination data would seem a particularly well-suited task, as you could return 100 records and then quickly return the next 100 on the next request.
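
a minimal sketch of that pagination idea, again in python with a sorted list standing in for the ordered store (the next_page helper, its parameters and the generated keys are hypothetical, not part of Tie::Judy): each request remembers the last key it returned, and the next page starts just after it.

```python
import bisect

def next_page(sorted_keys, after=None, page_size=100):
    """Return up to page_size keys that sort strictly after `after`."""
    # Binary search finds the resume point without rescanning earlier pages.
    start = 0 if after is None else bisect.bisect_right(sorted_keys, after)
    return sorted_keys[start:start + page_size]

# Hypothetical data: 250 zero-padded keys, so string order matches numeric order.
keys = sorted(f"key{i:04d}" for i in range(250))

first = next_page(keys)                      # first 100 records
second = next_page(keys, after=first[-1])    # next 100, resuming after the last key seen
third = next_page(keys, after=second[-1])    # remaining 50
```

resuming from the last key rather than from a numeric offset means keys inserted between requests don't shift the page boundaries.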