Corona Quick Thought – strict.lua or GLOBAL_lock()

Given that Lua is such a dynamic language, it is trivially easy to accidentally “litter” the global environment with variables you didn’t intend to be global. Even a simple spelling error (or a camelCase accident one way or the other) can introduce a new variable and fail to set the intended one: a “silent” error that Lua won’t complain about, but which may have unintended and hard-to-diagnose consequences in your code. For example:
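A minimal sketch of the kind of accident meant here (the names playerScore and addPoints are hypothetical, purely for illustration):

```lua
local playerScore = 0

local function addPoints(n)
  -- Typo: lowercase "s". Lua silently creates a NEW global named
  -- "playerscore" and leaves the intended local untouched.
  playerscore = playerScore + n
end

addPoints(10)
print(playerScore)  -- 0  (the variable we meant to update)
print(playerscore)  -- 10 (the accidental global)
```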

Of course, this problem isn’t unique to Corona. For instance, it’s well-described here along with several approaches to address it, including strict.lua and Niklas Frykholm’s GLOBAL_lock/unlock().

I’ve used both in the past, and both are directly usable in Corona, no special mucking about required. Effectively, what either of them does is lock the global environment, preventing unintended access afterward and generating an error if you attempt it.

If using GLOBAL_lock/unlock you can just copy the code from the wiki. If you wish to use strict.lua, I’d suggest getting it directly from the 5.1.5 source tarball available here (it’s in the etc directory). As of this writing, Corona is still using JNLua 0.9, so 5.1.5 would be the right match.

To use either, you may need to hunt for a “safe spot” in your code after you’ve declared any intended globals, for example:
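As a rough sketch of where that “safe spot” might sit: the lockGlobals/unlockGlobals helpers below are a simplified stand-in written for illustration, not the actual strict.lua or GLOBAL_lock() code, and APP_VERSION is a hypothetical intended global.

```lua
-- Intended globals are declared first, before the lock goes on.
APP_VERSION = "1.0"  -- hypothetical intended global

-- Simplified stand-in for strict.lua / GLOBAL_lock (NOT their real code):
-- any read or write of an undeclared global raises an error.
local function lockGlobals()
  setmetatable(_G, {
    __newindex = function(_, name)
      error("attempt to create global '" .. tostring(name) .. "'", 2)
    end,
    __index = function(_, name)
      error("attempt to read undeclared global '" .. tostring(name) .. "'", 2)
    end,
  })
end

local function unlockGlobals()
  setmetatable(_G, nil)
end

lockGlobals()  -- the "safe spot": globals are checked from here on

local score = 0
local ok, err = pcall(function()
  scroe = score + 10  -- typo for "score": now a loud error, not a silent global
end)
print(ok, err)

unlockGlobals()  -- e.g. before loading a module that is known to leak globals
```

The pcall is only there so the sketch can show the error without stopping; in real use you would let the error surface and fix the offending line.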

Next, you’ll want to give your app a thorough testing, exercising all of its code, as such errors can’t be noticed until execution. Then go about cleaning them all up.

Note that you may have to specifically “exempt” certain Corona modules which themselves leak globals, particularly if using GLOBAL_lock/unlock (which, oddly, is a bit “stricter” than “strict” 😀 ), for example:

Also note that you probably would not want to keep either active in a production build. Better to accidentally assign a global somewhere than crash hard with a debug-environment error, just in case your testing didn’t uncover absolutely everything. One way to do this (there are several, of course) is to wrap all of that code inside a conditional. You might test whether you’re running on the simulator and, if not, stub out all those routines to do nothing, so you can leave the rest of your code as-is. For example:
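One possible shape for such a conditional, as a sketch only. It assumes Corona’s system.getInfo("environment") call for the simulator test; realLock, realUnlock and noop are hypothetical names, and the lock itself is a simplified stand-in rather than the real strict.lua or GLOBAL_lock() code.

```lua
-- Stand-in lock/unlock routines (simplified, NOT the real strict.lua / GLOBAL_lock code).
local function realLock()
  setmetatable(_G, {
    __newindex = function(_, name)
      error("attempt to create global '" .. tostring(name) .. "'", 2)
    end,
  })
end
local function realUnlock() setmetatable(_G, nil) end
local function noop() end

-- system.getInfo("environment") is Corona's API; the type() check only keeps
-- this sketch from failing when run outside Corona, where "system" is absent.
local isSimulator = type(system) == "table"
  and system.getInfo("environment") == "simulator"

-- Real routines on the simulator, do-nothing stubs everywhere else.
local lockGlobals   = isSimulator and realLock   or noop
local unlockGlobals = isSimulator and realUnlock or noop

lockGlobals()  -- call sites stay identical in debug and production builds
```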

Corona Quick Thought – a reasonably fast 8-bit XOR in native Lua

I recently had a small need for a bitwise XOR, and was disappointed that Corona’s bit plugin is only available at Pro (or Enterprise) subscription levels.

Drat. My need wasn’t great enough to justify that expense. Note, though, that if you need thousands of XORs, particularly 16- or 32-bit ones, then you may indeed need the native code of the plugin; interpreted Lua simply can’t match that performance. But if your needs are more modest, then perhaps there’s an alternative…

For myself, I headed off to the Lua Wiki where I knew there to be several pure Lua implementations.

Alas, while there are many capable implementations listed there, they’re all just a bit too capable for my purposes. The downside of being fully capable (dealing with 32-bit quantities, etc.) and general-purpose is that their performance isn’t very snappy on a mobile device.

I just needed a special-purpose 8-bit xor, for dealing with values you might obtain from string.byte(), for example. So I scoured the list archive and found what looked like a good starting point. (And I couldn’t have asked for a better authority than Roberto! 🙂 )

Basically all I’ve done here is cripple it a bit, for 8-bit use only, then unroll the loop and remove some intermediate values where not reused, just to gain back a bit of performance.
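To make the idea concrete, here is one way a table-driven, unrolled 8-bit XOR can look in plain Lua. This is an illustrative sketch, not the code from the list archive nor my adapted version of it; the names xor4 and xor8 are hypothetical, and it assumes both arguments are integers in the range 0..255.

```lua
-- 16x16 lookup table of XOR results for 4-bit values (0..15),
-- built once at load time using arithmetic only.
local xor4 = {}
for a = 0, 15 do
  local row = {}
  for b = 0, 15 do
    local r, bit, x, y = 0, 1, a, b
    for _ = 1, 4 do
      local xa, yb = x % 2, y % 2
      if xa ~= yb then r = r + bit end   -- bits differ, so set this bit
      x, y, bit = (x - xa) / 2, (y - yb) / 2, bit * 2
    end
    row[b] = r
  end
  xor4[a] = row
end

-- 8-bit XOR: split each byte into two nibbles, look each pair up,
-- then recombine. The "loop" over two nibbles is written out by hand.
local function xor8(a, b)
  local alo, blo = a % 16, b % 16
  return xor4[(a - alo) / 16][(b - blo) / 16] * 16 + xor4[alo][blo]
end

print(xor8(170, 85))  -- 255
print(xor8(5, 3))     -- 6
```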

That’s about as far as it made sense for me to optimize. (I suppose you could duplicate and premultiply the table to save that final multiply, but it won’t make much difference. Or memoize it out to the full 64K entries… yikes. My feeling was that once you start contemplating those sorts of drastic measures, you should just be using the bit plugin instead.)

And here’s a simple validation suite against the bit plugin:

Corona Quick Thought – Game Entity Loop Performance, part 3

(here’s part 1 and part 2 if you missed them)
(full sample code through part 3 is here)

Finally! We’ve covered enough foundation to introduce something “new”. Though not really new at all, perhaps just less obvious.

Let’s remember back to our “boring” CS102 Data Structures class, and bring back the linked-list. A singly-linked list would do for these purposes, but I’ll make it doubly-linked anyway, as it’ll prove to be greatly advantageous in the future. (assuming I actually get around to writing a “part 4” of this series to cover insert/remove, maybe a “part 5” to cover entity pool maintenance, etc)

At this point we don’t need anything “fancy”, and in fact we can just install our linked list “over the top” in-place within the existing entity list:
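A minimal sketch of that in-place linking, assuming an array of entity tables like the one from part 2. The field names prev and next, and the entity properties, are hypothetical.

```lua
-- Stand-in entity list (the property names are illustrative).
local entities = {}
for i = 1, 1000 do
  entities[i] = { x = 0, y = 0 }
end

-- Install a doubly-linked list "over the top" of the existing array.
local head = entities[1]
for i = 1, #entities do
  local e = entities[i]
  e.prev = entities[i - 1]  -- nil for the first entity
  e.next = entities[i + 1]  -- nil for the last entity
end
```

The array is left intact, so indexed access still works alongside the links.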

If linked-lists are new to you, all that code is doing is giving each entity a “pointer” to its “neighbors”. We also save the first of those links in the variable “head”.

The real surprise, though, is iteration. The lowly while loop, which is entirely generic and not optimized for any particular use over another, really shines:
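In sketch form, the traversal looks something like this. The list setup is repeated so the snippet stands alone; the next field name and the visited counter are hypothetical.

```lua
-- Build a small entity list and link it (forward links are enough here).
local entities = {}
for i = 1, 1000 do
  entities[i] = { x = 0, y = 0 }
end
for i = 1, #entities do
  entities[i].next = entities[i + 1]  -- nil for the last entity
end
local head = entities[1]

-- Iterate by following the links until we fall off the end.
local visited = 0
local e = head
while e do
  visited = visited + 1  -- per-entity work would go here
  e = e.next
end
print(visited)  -- 1000
```

Note there is no index arithmetic and no table lookup by key to find the next entity: the current entity already holds the reference.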

Again, as before, I won’t post my own test results. Instead I suggest you run it for yourself. No guarantees, but I’d be surprised if you didn’t find the linked list iteration to be easily another 10% or so faster than “for i” (which was our previous performance winner from part 2)

That’s all for now, but if you’re at all familiar with linked lists then it ought to be obvious where this series is heading. Things like insert/remove in the context of an entity pool system become absolutely trivial with doubly-linked lists. Until then…

Corona Quick Thought – Game Entity Loop Performance, part 2

(here’s part 1 if you missed it, and part 3 if you’d rather skip ahead)
(full sample code through part 3 is here)

I’m a firm believer that there’s no point benchmarking something that’ll never happen. So we’re imagining a use-case along the lines of a bullet-fest space shoot-em-up, where you might have hundreds of animated entities on screen that you need to process each frame. Units and projectiles in a tower-defense-type game might make another good analogy, or a particle system for whatever purpose, or… Just so long as we have lots of “things” that need frequent individual attention, then we have a valid test scenario.

So, let’s make some entities, how about 1000? Seems a reasonably big number for a mobile game, but not too big; we don’t want to get too excessive here or we might hit memory issues that would never happen in real use. (On the other hand, if you DO actually have 100,000 entities, then THAT is what you should benchmark! I’m just trying to keep this in the realm of “generally applicable”.) Though let’s give them at least a couple of properties so that we’ve got at least SOME memory footprint.
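Something along these lines would do. It is a sketch; NUM_ENTITIES and the property names x, y, vx and vy are hypothetical.

```lua
local NUM_ENTITIES = 1000

-- A plain array of entity tables, each with a few properties
-- so that every entity has at least some memory footprint.
local entities = {}
for i = 1, NUM_ENTITIES do
  entities[i] = { x = 0, y = 0, vx = 1, vy = 1 }
end

print(#entities)  -- 1000
```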

Ok, now we’re again ready to “set the stage” with material that’s already been covered elsewhere. (again, hat tip to roaminggamer)
Let’s consider the two most common methods for iterating through those entities, and of the two which has the better performance?
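The two candidates, sketched with a simple os.clock() timing wrapped around each. REPS is a hypothetical repeat count, and os.clock() is used only because it works in any Lua; inside Corona you might prefer system.getTimer().

```lua
local entities = {}
for i = 1, 1000 do
  entities[i] = { x = 0, y = 0 }
end

local REPS = 1000  -- hypothetical repeat count; raise it for steadier numbers

-- Method 1: generic "for pairs"
local t0 = os.clock()
for _ = 1, REPS do
  for k, e in pairs(entities) do
    -- "e" already refers to the entity
  end
end
local pairsTime = os.clock() - t0

-- Method 2: numeric "for i"
t0 = os.clock()
for _ = 1, REPS do
  for i = 1, #entities do
    local e = entities[i]  -- the "extra line": actually fetch the entity
  end
end
local indexTime = os.clock() - t0

print(string.format("for pairs: %.3f s   for i: %.3f s", pairsTime, indexTime))
```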

Note: the extra line of code in the “for i” version is to keep this real, and fair. Let’s at least require of our benchmark that we actually obtain a reference to the entity, I mean, what’s the point of iterating through the list at all otherwise?

I won’t post my own results here (you can run it for yourself if curious, and your results will differ from mine in absolute numbers). Instead, I’ll just make this unsupported claim: on just about any imaginable hardware (and assuming the absence of any wild conditions, like a JIT compiler) you should find that “for i” is roughly 2.5 to 3 times faster than “for pairs”. (This shouldn’t come as any great surprise, as the “for i” loop was introduced to exploit the indexed side of Lua’s dual index/hash table implementation.)

For this type of benchmarking, relative numbers are far more important than absolute numbers. So if you’ll accept my unsupported claim on faith (at least until you’ve had a chance to verify it for yourself) then we can move on from here.

That’s all for now, but coming up next: Is there an iteration strategy even faster than “for i”? What about other list operations like insert/remove — might they affect the final decision on which method is “best” overall?

Corona Quick Thought – Game Entity Loop Performance, part 1

(here’s part 2 and part 3 if you’d rather skip ahead)
(full sample code through part 3 is here)

First, before I even begin, a hat tip to Ed at roaminggamer, who has already presented benchmarking code and results regarding loop performance. I’ll have to cover a bit of the same ground here just to be thorough, and as a point of reference so that we can explore further.

(also note that there are several Lua-related sites out there with performance tips, much is known on the topic.. worth a quick search for)

I’m going to make this a quickie, just to lay the foundation, as it’s stuff we’re pretty sure we already “know”, but with performance you should never take anything for granted. Here’s the benchmarking code I’ll use:

The first thing we should do is “benchmark the benchmark”. That is: time a “do-nothing” loop to see what overhead is present in the benchmarking framework itself. This result can essentially be treated like a constant once determined for any given platform — it’s the time required to execute all the “invariant” portions of any future test.
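A bare-bones harness and its do-nothing baseline might look like the following. This is a sketch, not necessarily the harness used for this series; the benchmark function and its parameters are hypothetical names, and os.clock() stands in for whatever timer your platform offers (system.getTimer() in Corona).

```lua
-- Run fn() "reps" times and report the total elapsed time in seconds.
local function benchmark(name, fn, reps)
  local t0 = os.clock()
  for _ = 1, reps do
    fn()
  end
  local elapsed = os.clock() - t0
  print(string.format("%-12s %d reps: %.3f s", name, reps, elapsed))
  return elapsed
end

-- "Benchmark the benchmark": time an empty function to measure the
-- overhead of the harness itself (loop plus function call).
local baseline = benchmark("do-nothing", function() end, 100000)
```

Later tests can subtract baseline from their own result when they use the same repeat count.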

That baseline number is hopefully small enough that you can just ignore it. If not, you should subtract it from any future results, as it is just “overhead” of the testing process.

Note that a timer that reports milliseconds need not be accurate to the millisecond; that’s why tests should be repeated many times, to help “drown out” any such noise. (And, of course, you also need to repeat many times just to get beyond the realm of microseconds.)

Note that Corona uses JNLua, not LuaJIT, so in theory we shouldn’t need to prime the loop for a JIT compiler. But if you’re reading this and using something other than Corona, and might be using LuaJIT, then you might want that priming loop in there. By default, LuaJIT treats a loop as hot (and starts considering it for compilation) after 56 iterations, so ~60 ought to do it. You wouldn’t want a single measurement to mix interpreted and compiled runs.

That’s all for now, just a set-up for the next part(s). By the end, I’ll demonstrate a performant method of looping an object table that might not be obvious.

Corona Quick Thought – excessive “local” in for loops

“Locals are faster than globals” is an oft-repeated Lua performance tip, and it’s true: local variables can be accessed more quickly (by index into the local environment) than less-local/more-global variables (where a wider environment must be accessed by name/key). So you tend to see a lot of “local” in Corona code, which is as it should be, though it sometimes gets carried too far.

For example, it’s not uncommon to see things like the following posted in the forums:

In both cases the local statement is unnecessary. The for loop will automatically create (and make very local) the index variables.

But additionally, it’s a bit misleading: it would seem to imply that those declared local variables will retain their last values after the loop, which is not the case. The loop creates its own index variables, which shadow the declared ones, so the declared “i”, “k” and “v” are never assigned and will all still be nil after the loop. In fact, you’ve actually worsened performance (albeit by a minuscule amount, a single VM instruction) by causing Lua to initialize space for variables that are never used!
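A short demonstration of the scoping involved (the variable names are illustrative):

```lua
local i            -- unnecessary: never touched by the loop below
for i = 1, 3 do    -- this "i" is a new variable, scoped to the loop body
end
print(i)           -- nil

local k, v         -- likewise unnecessary
for k, v in pairs({ a = 1, b = 2 }) do
end
print(k, v)        -- nil nil

-- If you really do need the last index after the loop, copy it out explicitly.
local last
for j = 1, 3 do
  last = j
end
print(last)        -- 3
```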

Simple take-away lesson: you need not declare index variables used in for loops.

Custom Devices with Corona Simulator (Windows)

As far as I can tell, as of public build 2014.2381, the Windows simulator in Corona doesn’t have built-in support for defining custom devices and resolutions. That is, at least, there’s no front-end GUI for doing so within the simulator itself.

However, the good news is that it does support custom devices — you’ll just have to define them manually. We won’t bother with creating frames and such in this article, just defining the resolution for a “bare” device.

Of course this is all undocumented, unsupported, whatever.. YMMV.

First, make sure Corona isn’t currently running. We’re going to add/modify some files that appear to only be read during startup.

Navigate to your Corona install directory (probably something like C:\Program Files (x86)\Corona Labs\Corona SDK), then into the Resources folder, then into the Skins folder. There you will find all the various device specifications.

You should also find one named “CustomDevice.lua.template”. You can use this as a starting point, or make a copy of some other one. Typically, you’d want to start with whatever is already “closest” to the device you intend to describe.

So, make a copy and name it something appropriate. (note that on Windows you may need to have administrator rights to create/modify files in this directory)

Next, edit the file to describe your intended device. Most of the properties will probably be self-evident. Obviously, the main ones of interest for our purposes here are “screenWidth” and “screenHeight”, so set those appropriately. Remove any reference to “deviceImage” (or any other .png files specified) as you won’t be using one. “screenOriginX” and “screenOriginY” should both be 0 (since there is no offset into the non-existent frame image).

Here’s a sample of one I made for a Nexus 7 (2012) with its available 800×1205 resolution in Portrait orientation (where soft menu bar occupies 75px):
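As a hedged sketch of the relevant part of such a file: only the four screen values below come from the description above. The enclosing table name (simulator) is an assumption from memory, and all remaining fields are omitted, so check both against CustomDevice.lua.template before relying on this.

```lua
-- Hypothetical "bare" device definition for a Nexus 7 (2012) in portrait.
-- Assumption: the enclosing table is named "simulator", as in the stock skins;
-- copy any other required fields from CustomDevice.lua.template.
simulator =
{
	screenOriginX = 0,    -- no frame image, so no offset into it
	screenOriginY = 0,
	screenWidth = 800,
	screenHeight = 1205,  -- 1280 minus the 75px soft menu bar
	-- no "deviceImage" entry: this is a "bare" device
}
```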

Your new device should show up in the list next time the simulator is launched. That ought to at least get you started. Have fun!

Icon Sizes

Yikes, we’re now up to 20 unique icon sizes for a full cross-platform universal build! It wouldn’t be so bad if it weren’t for the fact that icons, in particular, tend to go through a lot of redesign. Sooner or later, most indie developers start looking for some automated tool to generate all the various sizes from a single high-resolution master. And there are plenty of options to choose from for that automation.

(You DO have a high-resolution master, right?? 1024×1024, or more.)

I’ll discuss Adobe Illustrator/Photoshop here, but I think the same concepts will apply to other tool combinations (such as Inkscape/GIMP, for example). And let’s assume your source document is in vector format, because it gives us an additional rasterizing step to consider.

First, unless your artwork is very simple (or you’re just not very discriminating), a “save for web” straight from your vector package probably won’t produce optimal results, particularly at the lowest resolutions. Illustrator, for example, will tend to make things “overly bold” at low resolutions, so fine lines and other details may bleed into others nearby. You’ll likely get better results if you drop it into Photoshop at full resolution for the rasterizing, then downsample from there using either Photoshop itself, or some other tool of your choice. (Though much of this is subjective – for certain images you may actually prefer the artifacts from a certain tool path.)

So, let’s assume we’ve rasterized at full-resolution, we next need to consider resampling/downsizing. I’ll use the current working icon of Eggheadz Bounce as my test case:
icon comparison side-by-side
Here’s our example, using a 1024×1024 master, targeting 72×72. The images were produced as follows:

  • a “save for web” straight from Adobe Illustrator
  • fullsize rasterize in Photoshop downsized bicubic
  • fullsize rasterize in Photoshop downsized bicubic sharper
  • fullsize rasterize in Photoshop, downsized with ImageMagick using ‘catrom’ filter, strongly unsharp-ed
  • via
  • via

The last two are just to illustrate a couple (of the many) online tools that you might find. Note that some online tools might exhibit problems, like not preserving transparency, and most (that I’ve found) produce results that are just a bit soft for my tastes.

I suppose to a non-discriminating eye all of the above icons might look about the same — and I’ll admit that they do, to a degree. But there are subtle differences in sharpness that can be critical in improving readability on small devices. (if stacked on top of each other, and toggled A/B, the differences are more apparent than as presented here side-by-side)

(Check the ‘space’ between the eye-outline and the eye-brows, that’s a good place to look for the differences)

For my tastes, it’s still hard to beat Photoshop Bicubic Sharper. Though in fairness to ImageMagick, there are a dozen different filters, and endless combinations of other possible effects (notably unsharp, but also windowing functions, linear vs perceptual colorspaces, etc, on and on), so there is a lot of potential there.

This post is just to share some initial comments/experiences, fwiw. Perhaps in a later post, if there’s any interest, I’ll post my Photoshop and/or ImageMagick scripts (no great rush though, as many others have posted similar scripts, easily found if you search for them).

[edit] Also wanted to mention another online tool: as well as a link to the post with the photoshop script.

Leftover bits = promotional opportunity

There was intended to be a little ‘easter egg’ (sort of a bonus feature) in the character selection screen of Eggheadz Bounce. On the rear wall would be a generative painting – one that could produce an endless variety of ‘mountainous landscapes’. (well, “endless” within certain bounds – technically, they’d all be unique, though they’d all be somewhat similar)
Eggheadz Gallery
However, it’s not going to make the final release – primarily for performance reasons. It’s too tricky trying to figure out what level of detail will work for each of the vast array of devices out there. So, rather than make any one of them suffer just for the sake of this silly easter egg, it was decided to leave it out entirely.

But all is not lost… I’ve repackaged it into a standalone ‘art app’, and will release it as a freebie promo device.