(here’s part 1 if you missed it, and part 3 if you’d rather skip ahead)
(full sample code through part 3 is here)
I’m a firm believer that there’s no point benchmarking something that’ll never happen. So we’re imagining a use-case along the lines of a bullet-fest space shoot-em-up, where you might have hundreds of animated entities on screen that you need to process each frame. Units and projectiles in a tower defense -type game might make another good analogy, or a particle system for whatever purpose, or… Just so long as we have lots of “things” that need frequent individual attention, then we have a valid test scenario.
So, let’s make some entities, how about 1000? Seems a reasonably big number for a mobile game, but not too big — we don’t want to get too excessive here or we might hit memory issues that would never happen in real use. (On the other hand, if you DO actually have 100,000 entities, then THAT is what you should benchmark! i’m just trying to keep this in the realm of “generally applicable”.) Though let’s give them at least a couple of properties so that we’ve got at least SOME memory footprint.
1 2 3 4 5 6 7 8 9 10 |
local NENTITIES = 1000 local entityList = {} for i = 1, NENTITIES do entityList[i] = { alive = true, foo = 1, bar = 2, foobar = 12 } end |
Ok, now we’re again ready to “set the stage” with material that’s already been covered elsewhere. (again, hat tip to roaminggamer)
Let’s consider the two most common methods for iterating through those entities, and of the two which has the better performance?
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
do local function fn() for i=1,#entityList do local entity = entityList[i] -- see note below end end local elapsed = bench(fn,NREPS) print("for i loop", elapsed-baseline) end -- or: do local function fn() for _,entity in pairs(entityList) do -- see note below end end local elapsed = bench(fn,NREPS) print("for pairs loop", elapsed-baseline) end |
Note: the extra line of code in the “for i” version is to keep this real, and fair. Let’s at least require of our benchmark that we actually obtain a reference to the entity, I mean, what’s the point of iterating through the list at all otherwise?
I won’t post my own results here (because you can run it for yourself if curious, and your results will differ from mine in absolute numbers). Instead, I’ll just make this unsupported claim: on just about any imaginable hardware (and assuming the absence of any wild conditions, like a JIT compiler) you should find that “for i” is about 2.5 – 3.0 times (roughly) faster than “for pairs”. (This shouldn’t come as any great surprise — as the “for i” loop was introduced to exploit the indexed side of Lua’s dual index/hash table implementation.)
For this type of benchmarking, relative numbers are far more important than absolute numbers. So if you’ll accept my unsupported claim on faith (at least until you’ve had a chance to verify it for yourself) then we can move on from here.
That’s all for now, but coming up next: Is there an iteration strategy even faster than “for i”? What about other list operations like insert/remove — might they affect the final decision on which method is “best” overall?