ROBOT-SB dev blog – bitmap fonts

One of the components that I needed for ROBOT-SB was bitmap font support.

I didn’t need anything elaborate – in fact, given the low resolution that I’m working at, I hardly needed a proper “library” at all; just a few “helper functions” might have done the job.

So I began by exploring extant solutions, hoping to avoid reinventing the wheel if I could find something suitable already out there.

But the libraries that I found were either 1) outdated, and I had no desire to spend time updating someone else’s code that might not end up being what I needed anyway; or 2) a bit too elaborate, in that they weren’t focused on the particular issue that I needed to solve, primarily: proper pixel alignment.

I don’t need variable spacing, or kerning pairs, or tools for converting TrueType fonts, or styling support, or animation, et cetera, primarily because features of that type tend to move characters around in a way that is not pixel-perfect (or, at least, would have required a bit of patching to ensure pixel-perfect positioning in all cases).

So, in the end, I reinvented a toy-sized wheel, and I am oh so proud to now present you with.. wait for it… tah dah… SimpleBitmapFont!

I’m pretty sure that letters that look like fire are still cool, right?

You can get it from github.

Please note that it is definitely not intended as the be-all-end-all implementation of bitmap fonts! I’ll leave that to others with grander aspirations. But feel free to use it as a jumping-off point if it seems more-or-less appropriate for your particular project.

It is about as minimal as I could practically make it, something that just “gets the job done,” but not much more. (The version I’m using in-game is a bit different, with a few more specialized helper methods, but they aren’t of general-use enough to include.)

ROBOT-SB dev blog – get/set performance

Performance. Benchmarking. Blech! There is perhaps no more controversial subject. But I’ve got a little tidbit of info to share on the subject, so let’s dive right in, shall we?

Obligatory Disclaimer

There’s a wealth of information out there on the dangers of optimizing, and why you shouldn’t bother unless you’re sure it matters. In fact, you might even make things worse if you just operate on assumptions rather than actual benchmarking.

So my intent in sharing this little tip is NOT to suggest that anyone should just use it blindly. Rather, just file it away for possible consideration in case you’re faced with an actual scenario where benchmarking has pointed to this specific problem.

Also: This whole discussion only applies to the current build of Corona SDK as of this writing, its specific implementation of internal data structures, and its specific version of Lua (5.1.5 as of this writing). Any of that could change without warning, at which point this discussion may be rendered useless.

Get/Set Performance of Display Objects

It’s not widely-documented or well-explained, but accessing the properties of a Corona DisplayObject is a more intensive operation than accessing a simple Lua table element. For example, the difference between this..:

..and this..:

But don’t just take my word for it, benchmark it for yourself, on your own devices.
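As a rough idea of what such a comparison can look like (this is a minimal sketch, not the benchmark source linked below), the loop count `N` and the helper names `timeGet`/`timeSet` are made up for illustration. It assumes `system.getTimer()` and `display.newRect()` are available when run inside Corona, and falls back to stock Lua otherwise:

```lua
-- Sketch only: time N reads and N writes of ".x" on a plain table
-- and on a DisplayObject. Helper names are illustrative.
local N = 100000

-- Corona's system.getTimer() returns milliseconds; fall back to os.clock()
-- so the harness also runs under a stock Lua interpreter.
local now = (system and system.getTimer) or function() return os.clock() * 1000 end

-- Inside Corona, measure a real DisplayObject. Elsewhere a plain table
-- stands in, so both sets of numbers will then be nearly identical.
local tbl = { x = 0 }
local obj = display and display.newRect(0, 0, 10, 10) or { x = 0 }

local function timeGet(target)
  local t0 = now()
  for i = 1, N do local v = target.x end
  return now() - t0
end

local function timeSet(target)
  local t0 = now()
  for i = 1, N do target.x = i end
  return now() - t0
end

local results = {
  tableGet = timeGet(tbl), tableSet = timeSet(tbl),
  objGet   = timeGet(obj), objSet   = timeSet(obj),
}
for _, key in ipairs({ "tableGet", "tableSet", "objGet", "objSet" }) do
  print(key, results[key])
end
```

The timings include loop overhead, so it is worth timing an empty loop as well and keeping that figure in mind when comparing.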

Get the entire source code here.

Results on a Nexus 7 2012:

(snip)
I/Corona ( 540): Platform: Nexus 7 / ARM Neon / 5.1.1 / NVIDIA Tegra 3 / OpenGL ES 2.0 14.01003 / 2018.3205 / English | US | en_US | en
(snip)
I/Corona ( 540): OVERHEAD : 29.62400000
I/Corona ( 540): TABLE GET : 38.23500000
I/Corona ( 540): TABLE SET : 31.54000000
I/Corona ( 540): TABLE GET/SET INC : 41.15600000
I/Corona ( 540): DISPOBJ GET : 156.66800000
I/Corona ( 540): DISPOBJ SET : 905.53800000
I/Corona ( 540): DISPOBJ GET/SET INC : 1193.00400000
I/Corona ( 540): DONE

Results on a Nexus 7 2013:

(snip)
I/Corona (13231): Platform: Nexus 7 / ARM Neon / 5.0.2 / Adreno (TM) 320 / OpenGL ES 3.0 V@95.0 AU@ (GIT@Ia6306ec328) / 2018.3205 / English | US | en_US | en
(snip)
I/Corona (13231): OVERHEAD : 15.04500000
I/Corona (13231): TABLE GET : 19.25700000
I/Corona (13231): TABLE SET : 20.99700000
I/Corona (13231): TABLE GET/SET INC : 25.84800000
I/Corona (13231): DISPOBJ GET : 156.92200000
I/Corona (13231): DISPOBJ SET : 1112.64000000
I/Corona (13231): DISPOBJ GET/SET INC : 1472.83900000
I/Corona (13231): DONE

Actual results will, of course, vary from device to device. Note, for example, that my poor Nexus 7 2013 has been so heavily used that its overall performance has deteriorated, to the point where my 2012 outperforms it, on this test at least.

Still, as a generalization, and apparently regardless of the specific device, a “get” on a DisplayObject takes several times longer than a simple table access (about 4X to 8X in the raw timings above, call it 5X), while a “set” on a DisplayObject takes tens of times longer than a simple table access (about 30X to 50X, call it 30X).
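For anyone who wants to check the arithmetic, here is a small sketch that divides the raw DISPOBJ timings by the raw TABLE timings from the two logs above. The table layout and the `ratios` helper are just for illustration, and loop overhead is deliberately not subtracted:

```lua
-- Raw timings copied from the two logs above (same units as logged).
local logs = {
  ["Nexus 7 2012"] = { tableGet = 38.235, tableSet = 31.540,
                       objGet = 156.668, objSet = 905.538 },
  ["Nexus 7 2013"] = { tableGet = 19.257, tableSet = 20.997,
                       objGet = 156.922, objSet = 1112.640 },
}

-- Returns (get ratio, set ratio): DisplayObject time over plain-table time.
local function ratios(r)
  return r.objGet / r.tableGet, r.objSet / r.tableSet
end

for name, r in pairs(logs) do
  local getRatio, setRatio = ratios(r)
  print(string.format("%s: get %.1fX, set %.1fX", name, getRatio, setRatio))
end
```

That works out to roughly 4.1X and 28.7X on the 2012 device, and roughly 8.1X and 53.0X on the 2013 device. Subtracting the OVERHEAD figure first would make the ratios considerably larger, since the plain-table timings are barely above the empty loop.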

The root of the matter is that DisplayObjects are fundamentally different things than plain-old Lua tables. All the extra work necessary to drill through the DisplayObject table wrapper, through its metatable, into the proxied internal userdata representation of the display object, then push the result back onto the stack for Lua, adds up to a measurable performance difference.

(I don’t claim to know the exact internal implementation of Corona SDK, but it doesn’t matter – all that really matters is that there is a measurable difference between the two different types of access.)

Reducing Accesses to DisplayObject Properties

So, the first part of the tip is just to reduce access to a DisplayObject’s properties to the extent you can, using “obvious” techniques.

At this stage, however, we need something more “practical” to benchmark than just raw access as above. The results above are interesting as indicators of where performance might be gained, but are too simplistic to be of direct use.

In other words, it’s all well and good to claim that some micro-statement performs better than another, but how then to apply it practically? Thus, it’s useful when benchmarking to test something of just-enough complexity (without going overboard and confusing what’s being tested) to actually reveal whether one “whole solution” is better than another.

So, let’s consider the following code which is intended to “bounce” an object around the screen borders:

That’s a lot of get/set operations directly on rect.x and rect.y, which could be reduced by instead doing something like:

Note that it isn’t strictly necessary to alias “rect.dx” here, as that access isn’t necessarily a problem – Lua’s hash-table access performance is really good. So, while repeated use of any table element might benefit from local aliasing (and this is a common performance technique in Lua), it is particularly true for DisplayObject properties (the topic herein).
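To make the two variants concrete, here is a minimal sketch of a “bounce” update written both ways. The function names and the `maxX`/`maxY` parameters are hypothetical; `rect` is assumed to carry `x`, `y`, `dx` and `dy`, and anything with those fields (including a plain table) will do for trying it out:

```lua
-- Direct version: every read and write goes through rect.x / rect.y.
local function bounceDirect(rect, maxX, maxY)
  rect.x = rect.x + rect.dx
  rect.y = rect.y + rect.dy
  if rect.x < 0 or rect.x > maxX then rect.dx = -rect.dx end
  if rect.y < 0 or rect.y > maxY then rect.dy = -rect.dy end
end

-- Aliased version: one get and one set per axis, locals for the rest.
local function bounceAliased(rect, maxX, maxY)
  local x = rect.x + rect.dx
  local y = rect.y + rect.dy
  if x < 0 or x > maxX then rect.dx = -rect.dx end
  if y < 0 or y > maxY then rect.dy = -rect.dy end
  rect.x = x
  rect.y = y
end

-- Both versions produce the same motion.
local a = { x = 0, y = 0, dx = 3, dy = 2 }
local b = { x = 0, y = 0, dx = 3, dy = 2 }
for step = 1, 10 do
  bounceDirect(a, 20, 30)
  bounceAliased(b, 20, 30)
end
print(a.x, a.y, b.x, b.y)
```

Per axis, the direct version does up to three gets and one set on the object, while the aliased version does one get and one set.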

Is It Worth The Trouble?

Well, that’s for benchmarking to decide, of course! If it’s just a single DisplayObject, then who cares? There’s not enough difference here to add up to anything measurable.

But if you have, say, a thousand such objects, each updating 60 times per second, and you could perhaps also use those already-aliased values for collision detection or something else, then the cumulative effect might add up to something worth caring about.

Further Reducing Accesses to DisplayObject Properties

Now we come to the “non-obvious” (perhaps) techniques.

It turns out that the difference between plain-old table access and DisplayObject property access is so large in relative terms (again, roughly 5X for get, 30X for set) that there exists an opportunity to “spoof” or “proxy” the DisplayObject’s properties with plain-old table elements, and still come out a winner.

That was perhaps confusing, let me try to explain with code instead. Taking the former “bounce” example, we can eliminate the “get” of rect.x|y entirely by creating and maintaining our own local copies of x|y instead (calling them “xp” and “yp”):

Get the entire source code here.

So, what did we do? We traded a get of rect.x, for a get/set of rect.xp.

Unfortunately we can’t get rid of the set of rect.x (which is the larger of the two performance issues) because eventually we need to actually update the rect’s position. I wish Corona SDK offered something like a “moveTo” method – an absolute version of the existing relative “translate” method. That would be worth comparing!

But, it is still a net performance win because a get/set on a plain-old Lua table element takes less time than a get on a DisplayObject property.
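A minimal sketch of the idea, assuming `rect.xp` and `rect.yp` were seeded from `rect.x` and `rect.y` once at creation time. The function name and the bounds parameters are hypothetical, and a plain table stands in for the DisplayObject so the snippet runs anywhere:

```lua
-- "Spoofed" version: xp/yp are ordinary custom fields that shadow x/y,
-- so the update never has to read rect.x or rect.y back.
local function bounceSpoofed(rect, maxX, maxY)
  local xp = rect.xp + rect.dx
  local yp = rect.yp + rect.dy
  if xp < 0 or xp > maxX then rect.dx = -rect.dx end
  if yp < 0 or yp > maxY then rect.dy = -rect.dy end
  rect.xp, rect.yp = xp, yp   -- cheap: custom fields
  rect.x,  rect.y  = xp, yp   -- the unavoidable sets on the real properties
end

-- Stand-in object; in an app this would be a DisplayObject.
local rect = { x = 5, y = 5, dx = 3, dy = 2 }
rect.xp, rect.yp = rect.x, rect.y   -- seed the shadow copies once

for step = 1, 100 do
  bounceSpoofed(rect, 20, 30)
end
print(rect.x, rect.xp, rect.y, rect.yp)
```

The shadow copies only stay valid if nothing else moves the object behind their back; any other code that sets `rect.x` directly would need to update `rect.xp` too.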

Results on a Nexus 7 2012:

(snip)
I/Corona (16543): Platform: Nexus 7 / ARM Neon / 5.1.1 / NVIDIA Tegra 3 / OpenGL ES 2.0 14.01003 / 2018.3205 / English | US | en_US | en
(snip)
I/Corona (16543): OVERHEAD : 24.74000000
I/Corona (16543): DISP OBJ XY : 2430.85200000
I/Corona (16543): DISP OBJ XPYP : 1938.96100000
I/Corona (16543): DISP OBJ VIEW XPYP : 2020.21300000
I/Corona (16543): DONE

Results on a Nexus 7 2013:

(snip)
I/Corona (23013): Platform: Nexus 7 / ARM Neon / 5.0.2 / Adreno (TM) 320 / OpenGL ES 3.0 V@95.0 AU@ (GIT@Ia6306ec328) / 2018.3205 / English | US | en_US | en
(snip)
I/Corona (23013): OVERHEAD : 13.45800000
I/Corona (23013): DISP OBJ XY : 2872.28400000
I/Corona (23013): DISP OBJ XPYP : 2378.93700000
I/Corona (23013): DISP OBJ VIEW XPYP : 2417.45000000
I/Corona (23013): DONE

Further Exploration

I mentioned above my wish for something like a “moveTo” method on DisplayObjects, an absolute version of the existing relative “translate” method – essentially a “setXY(x,y)” method. There would likely be some performance tricks that could be wrangled out of such a method – it would at least be worth comparing to other techniques. Still, if your total usage is as simple as the “bounce” code above, it might be worth trying to substitute in a relative obj:translate(obj.dx, obj.dy) and benchmark it.

This is just offered as food for thought if you were serious about taking the “bounce” benchmarking to its logical conclusion. But it’s beyond the scope of what I intended to cover here. Essentially, you’d just want to compare:

But I’ll leave the actual benchmarking as a homework assignment for the reader, if the reader is interested enough to try it. Basically: Are two reads and a function call better than two writes? Suffice it to say that “translate(1,1)” will handily beat “x=x+1; y=y+1”-type code, but then you’ve got the bounce logic to factor in – assuming you want to fairly benchmark and produce the same results as the other examples – and you’d potentially lose the ability to reuse those “spoof” values elsewhere.
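For reference, one way the translate-based candidate might look, as a sketch only. `bounceTranslate` and the `newStub` stand-in are hypothetical names; inside Corona, `rect` would be a real DisplayObject and `rect:translate(dx, dy)` its built-in method:

```lua
-- Translate version: one method call to move, then two reads for the
-- bounce test (instead of two writes to x and y).
local function bounceTranslate(rect, maxX, maxY)
  rect:translate(rect.dx, rect.dy)
  local x, y = rect.x, rect.y
  if x < 0 or x > maxX then rect.dx = -rect.dx end
  if y < 0 or y > maxY then rect.dy = -rect.dy end
end

-- Stand-in with a translate method so the sketch runs outside Corona.
local function newStub(x, y, dx, dy)
  local o = { x = x, y = y, dx = dx, dy = dy }
  function o:translate(deltaX, deltaY)
    self.x = self.x + deltaX
    self.y = self.y + deltaY
  end
  return o
end

local stub = newStub(0, 0, 3, 2)
for step = 1, 10 do
  bounceTranslate(stub, 20, 30)
end
print(stub.x, stub.y)
```

The stub says nothing about real performance, of course; it only shows that the logic matches the other variants so that a benchmark would be comparing like with like.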

Wrapping Up

Granted, we’re only talking about a ~15-20% difference in the second example. There is nothing earth-shattering here.

So, whether or not there’s any practical way to apply this tip would require benchmarking of specific situations.

The more you could potentially reuse and take advantage of these “spoofed”/”proxied” values, then the more you could potentially gain over repeated DisplayObject property access.

Also note that this applies not just to x|y, but any DisplayObject userdata property. Say, for example, you needed (for whatever reason) repeated access to an object’s .width (and assuming that it is unchanging), then spoof it into rect.myWidth=rect.width once, then use rect.myWidth thereafter.

But the tip from the first example – using aliases in the more “obvious” manner – remains a good general practice, regardless. (though don’t over-do it – ie, no need to alias something only accessed once)

ROBOT-SB dev blog – mines, rocks and difficulty progression

This week saw the introduction of mine-laying enemies:

The mine frequency/density has been dialed up in that clip just for testing purposes – it’s a bit too much to handle with a base/un-upgraded ship.

Aside, $0.02, fwiw: I believe that features should only be added to a game when there’s a rational justification for them. For example, not every app needs a match-3 mini-game in it “just because you can”. Raw feature-count alone shouldn’t drive development – everything should serve a purpose, otherwise it’s likely to end up being just a cobbled together collection of random ideas, rather than a cohesive whole.

The mines will serve two purposes:
1) to provide additional variety in the early game
2) to provide additional difficulty in the later game

Mines are essentially the opposite of bullets. Bullets come at you fast, but they’re relatively small and easy to dodge. Mines come at you slowly, but are larger, harder to dodge. Mines tend to fill up the screen in a more-persistent way than bullets. But since they appear at current enemy location, there may not be much time to prepare for them.

Mines are also a bit like rocks, in that they’re bigger/slower than bullets, though the rocks serve (primarily) a slightly different purpose: to discourage the player from just “camping” on the edge of the screen where it otherwise might be marginally safer. The player has plenty of time to see rocks coming from top-of-screen and decide if edge or center is a more-attractive path:

Hey, that was some fancy maneuvering, even if I do say so myself!

But back to mines, and difficulty progression…

The general way the difficulty progression works is: At the start of each wave, some overall decisions are made regarding the type, character and quantity of enemies that will spawn. Wave content isn’t predefined, but it isn’t purely random either. Essentially, the farther along you get, the harder the wave will be.

Wow, what a novel and unique idea!!! (IOW: duh, why would I even bother writing that?)

Early on, mine-laying enemies will occur every few waves, but it’ll be an all-or-nothing decision. That is, bullet-shooting enemies won’t also occur with mine-laying enemies in the early waves. The combination of the two simultaneously seems too challenging for the early waves. But at least you’ll have seen the mines in small numbers before you’re spammed with them later.

Later on, both types of enemies will be allowed to occur simultaneously. Beyond some “threshold” wave (which has yet to be properly determined through playtesting) a spawned enemy might be either a bullet-shooting or mine-laying variant. This has the effect of filling the screen with both fast- and slow-moving projectiles (in addition to the enemies themselves, which must also be avoided) which is quite a bit more difficult to find a safe path through than either one by itself.

The wave generation is essentially just a set of probability and distribution tables, then once set up it’s effectively all data-driven til the next wave. But it functions roughly like:
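Purely as an illustration of the rules described above (this is not the game's actual code), the sketch below decides the enemy type for each spawn in a wave. Every name and number in it is hypothetical: `THRESHOLD_WAVE`, `MINE_WAVE_EVERY`, the 50/50 split and the enemy-count formula are placeholders for values that would come from the probability tables and playtesting:

```lua
local THRESHOLD_WAVE  = 10   -- hypothetical: waves at or beyond this mix both types
local MINE_WAVE_EVERY = 3    -- hypothetical: "every few waves" is all mines

-- rng is injectable so the plan can be made deterministic for testing.
local function planWave(wave, rng)
  rng = rng or math.random
  local plan = { count = 5 + wave, kinds = {} }   -- placeholder difficulty ramp
  local mixed = wave >= THRESHOLD_WAVE
  local allMines = (not mixed) and (wave % MINE_WAVE_EVERY == 0)
  for i = 1, plan.count do
    if mixed then
      -- late game: each spawn may be either variant
      plan.kinds[i] = (rng() < 0.5) and "miner" or "shooter"
    elseif allMines then
      -- early game: all-or-nothing mine wave
      plan.kinds[i] = "miner"
    else
      plan.kinds[i] = "shooter"
    end
  end
  return plan
end

local early = planWave(3)
print(early.count, early.kinds[1])
```

Once the plan table is built, the rest of the wave can simply consume it, which matches the “data-driven until the next wave” description.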

I don’t think that I’ll need to allow a single enemy to be both a shooter and miner simultaneously. The code is structured to easily allow it, but I’m holding off until playtesting can determine if it’s going to be needed, perhaps at some “very-late” stage of the game (for hard-core, fully upgraded players).

But it’s too early to tell yet – I’m not (yet?) good enough at my own game to playtest it that far!

ROBOT-SB dev blog – “pixel perfect” part 3

Welcome to the third installment of the “pixel perfect” series. This is the one where I’m supposed to quit stalling and actually talk about the subject rather than more boring preliminary background material (BPBM).

What?! You don’t like BPBM?!?! Yeh, ok, I’m with ya. Though, by the way, there’s plenty of BPBM in part 1 and part 2 if you’re into that sort of thing, in case you missed it.

Feel free to just download the source for example 1, example 2 and example 3 if you’re the type who does better studying actual code and would rather skip all the reading.

Now, on to the TLDR;…

First, a quick reminder of the goal: to create a “low-resolution retro blocky pixel” configuration for Corona SDK that displays properly and without aliasing artifacts on a wide variety of target devices, resolutions and aspect ratios.

Specifically, for my purposes, the “content resolution” should be 160 units wide, by whatever number of units high needed to match the device aspect ratio:

Note that there’s nothing “magical” about 160 specifically, other than it’s convenient for this project. It’s similar to Game Boy resolution, and a nice-enough divisor of common mobile device resolutions, that it was worth adopting, at least as a baseline. So that’s the dimension I’ll use for the rest of this discussion.

(Besides, if I tried to keep this purely theoretical and avoid selecting a specific resolution, then I’d probably never get anywhere with regard to presenting an actual working implementation. Just endless BPBM. So, adapt this discussion to other specific dimensions if/as you see fit.)

In previous installments I mentioned that something, somewhere will have to be “flexible” in order to accomplish this goal across the wide variety of devices. So, in this installment I’ll present two approaches: a simpler one, that “mostly” accomplishes the goal, with perhaps-acceptable compromises; and a more complex one, that does accomplish the goal, though with additional considerations that will have to be dealt with in program code.

Throughout this discussion I’ll be assuming portrait orientation just so I can adopt a single set of terms. But all of these concepts will apply similarly to landscape orientation.

The Simple Approach

If you carefully apply the foundations (ie, the BPBM) covered in the first two installments, then this approach can work pretty well under some conditions, and is so simple as to almost not warrant any discussion at all! But, it is not “pixel perfect”, except on a limited range of devices. What it is, actually, is “pixel-acceptable-but-does-preserve-fixed-content-width”. However, there are aspects of it that will apply to the more complex solution, so it’s worth covering first and explaining a bit before presenting something more complex.

“Wait just a gosh-darned minute,.. so is this so-called ‘simple approach’ just another sneaky way of wasting our time on more BPBM?!?!”

Good for you, you’re starting to catch on.

Here’s the config.lua file:

And main.lua (for the on-screen results presented a bit below):

Or, grab the entire source code for this example.

The only thing even a little bit “tricky” in there is calculating the height based on the device’s aspect ratio.
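A minimal sketch of that height calculation and the resulting config.lua, assuming portrait orientation and the "letterbox" scale mode (the scale mode is my assumption here). The helper name `simpleContentSize` is made up, and the `if display` guard only exists so the snippet can also be run under stock Lua:

```lua
-- config.lua sketch for the simple approach.
local TARGET_WIDTH = 160

-- Width stays fixed; height follows the device aspect ratio,
-- floored so that an integer is always requested.
local function simpleContentSize(pixelW, pixelH)
  return TARGET_WIDTH, math.floor(TARGET_WIDTH * pixelH / pixelW)
end

if display then   -- only defined when running inside Corona
  local w, h = simpleContentSize(display.pixelWidth, display.pixelHeight)
  application = {
    content = {
      width  = w,
      height = h,
      scale  = "letterbox",   -- assumption
      xAlign = "left",
      yAlign = "top",
    },
  }
end

print(simpleContentSize(640, 960))
print(simpleContentSize(720, 1280))
```

For a 640×960 device this requests 160×240; for 720×1280 it requests 160×284 (the exact height would be 284.44...).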

(I’ve been doing aspect ratio height calcs since my first encounter with Corona SDK, even when using more-typical content dimensions, just to prevent screenOriginY and contentCenterY issues, so it has long since lost its “magic” for me. It would be helpful if you grok it well enough that it ceases to seem “magical” to you too, at least by the time we reach the complex approach, just so that it can be taken for granted and ignored.)

But another thing worth noting is that “left”/”top” alignment is used rather than “center”/”center”. This is done to assure that the alignment of the content coordinate grid at least begins (at 0,0) in alignment with the hardware pixels. It’s not so much an issue for the x-axis, which we hold constant in this case, but may come into play with the y-axis…

It’s important to note that config.lua treats width and height as integers. This isn’t documented anywhere, but is easy enough to test: just “request” a content width of say 321.4 or a content height of 478.2 in config.lua, then report in main.lua what your actual values are for display.contentWidth|Height – they’ll be integers. display.actualContentWidth|Height are where you need to turn for the final resulting and potentially fractional* values.

So, the use of math.floor() on the calculated height assures that if the aspect ratio math doesn’t work out to a whole number (as is common with 4:3, 5:3, 16:9 or any other aspect ratio that yields a repeating decimal) then we’ll always specify “just under” the exact fractional height, which will cause a “tiny bit” of vertical letterboxing.

(btw, all of this works just as well if the calculated height does happen to work out to be a whole number on devices with simpler aspect ratios such as 3:2)

Any residual fractional-pixel letterboxing will manifest itself as a difference between display.contentHeight and display.actualContentHeight. What we don’t want is any of that residual fractional-pixel letterbox height affecting the pixel alignment, and by default yAlign=”center” will split that fraction among equal top and bottom letterboxes*, causing a non-zero display.screenOriginY.

By using yAlign=”top” it will force any residual fractional-pixel letterboxing to all occur at the bottom of the screen, maintaining display.screenOriginY=0, and preserving the origin of content-versus-device pixel alignment.
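To put numbers on that residual, here is a hand-calculation sketch. The helper name is hypothetical and it assumes ideal letterbox math; the values Corona actually reports may differ slightly:

```lua
-- Given device pixels and a fixed content width, returns:
--   the floored content height,
--   the leftover letterbox in content units,
--   the same leftover in device pixels.
local function residualLetterbox(pixelW, pixelH, contentW)
  local contentH = math.floor(contentW * pixelH / pixelW)
  local scale    = pixelW / contentW    -- device pixels per content unit
  local actualH  = pixelH / scale       -- full screen height in content units
  local residual = actualH - contentH
  return contentH, residual, residual * scale
end

print(residualLetterbox(720, 1280, 160))
print(residualLetterbox(640, 960, 160))
```

On a 720×1280 device the content height is 284, leaving about 0.44 content units (about 2 device pixels) over. Centered, that would be split into two slivers of roughly 0.22 units each, pushing the origin off the hardware pixel grid; top-aligned, it all lands at the bottom. On 640×960 there is no residual at all.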

* Maybe in a subsequent post I’ll delve into some other “weirdness” that crops up with Corona SDK, perhaps due to internal precision or rounding issues, if you ever attempt to “fact-check” the various display.* metrics with actual hand-calculated math. Things just don’t add up right. It’s definitely relevant, and factors into the solution presented here, but it’s pretty esoteric stuff. Suffice to say there are sufficient reasons for preferring “left”/”top” alignment here.

Advantage:
display.contentWidth = display.actualContentWidth = target width of 160, and is already pixel-perfect on a wide variety of devices with 160N widths (where N is an integer), e.g.: 320×480, 480×854, 640×960, 800×1280

640x results, where retro-pixels are all consistently 4×4 hardware pixels (cropped vertically just to save space) click for full-size:

Disadvantage:
Will have apparent non-square pixels when nearest-neighbor sampling is applied on devices with widths that are not 160N, e.g.: 400×800, 720×1280, 768×1024

720x results, where retro-pixels are inconsistently 4×4, 4×5, 5×4, 5×5 hardware pixels (cropped vertically just to save space) click for full-size:

However, all is not as bad as it appears. Unless you’re actually using something like this QA pixel grid, which is intended to highlight problems, it’s actually quite difficult to see the inconsistent retro-pixels on-device once the hardware resolution approaches 800 or so (depending on display size, thus DPI). Pixel density beyond that tends to obscure these tiny irregularities.

And since many of the more-popular lower-resolution devices (below 800, with low DPI’s) are covered by the 160N resolutions, perhaps the irregular pixels might be something you could just ignore.
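To see where the uneven retro-pixels come from, the sketch below counts how many device pixels end up sampling each content column. It assumes a simplified model of nearest-neighbor sampling (each device pixel picks the content column under its center), and the function names are made up:

```lua
-- For each content column, count the device pixels whose centers fall in it.
local function columnWidths(pixelW, contentW)
  local widths = {}
  for c = 1, contentW do widths[c] = 0 end
  for px = 0, pixelW - 1 do
    local col = math.floor((px + 0.5) * contentW / pixelW)
    widths[col + 1] = widths[col + 1] + 1
  end
  return widths
end

-- Tally how many columns have each width.
local function summarize(widths)
  local counts = {}
  for _, w in ipairs(widths) do counts[w] = (counts[w] or 0) + 1 end
  return counts
end

local on720 = summarize(columnWidths(720, 160))
local on640 = summarize(columnWidths(640, 160))
print(on720[4], on720[5], on640[4])
```

With 160 content units across 720 device pixels, half the columns come out 4 pixels wide and half come out 5; across 640 device pixels every column is exactly 4 wide.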

The Complex Approach

I know what you’re saying: “But I don’t want any non-square retro-pixels, ever!” Well then, something else is going to have to give, because 160 is simply not a common divisor among all the various devices.* That’s what the complex solution addresses.

* I believe that since the odd 1125×2436 iPhone X, the greatest common divisor across all various device widths is now 1. I have no desire to use UV coordinates as content dimensions – but try it if you like, it’s “fun”.

The difference between the approaches is that instead of keeping 160 as a fixed width as the simple approach did, the complex approach will allow a “160-or-a-bit-more” width, in order to obtain an integer content scale ratio against the device pixels.

This implies that the rest of the actual app’s code must be “flexible enough” to deal with the varying content width (as well as height). This is the trade-off, and may or may not present further issues. It’s really no more complicated than “traditional” letterboxing concerns, though the “source” of the letterboxed dimensions is under our control.

For example, if the app presents a scrolling world, then it probably won’t present much of an issue, just allow a bigger “viewport”. But if the app “depends” in one way or another on a known/fixed width, then it’ll either have to adapt to it or “ignore” (render useless) the extra area. (more on identifying that extra area later)

Remember that there’s really no concern about height – that’ll still be calculated from the device aspect ratio. The consideration can be limited to device width…

Consider the 768×1024 iPad. It would be possible to get perfectly square retro-pixels with 4×4 hardware-pixels if content width were specified as 192. (768/192=4).

Similarly, a 720×1280 device could achieve perfectly square retro-pixels with 4×4 hardware-pixels if content width were specified as 180. (720/180=4)

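Similarly, a 1080×1920 device could achieve perfectly square retro-pixels with 5×5 hardware-pixels if content width were specified as 216 (1080/216=5) or, closer to the 160 target, with 6×6 hardware-pixels if content width were specified as 180. (1080/180=6)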

Do you see a strategy emerging here? Essentially we’re looking for the greatest integer divisor of device width that will produce an integer quotient >= 160 for content width.

That sounds pretty simple to implement, and indeed it is. (drum roll please, this is what you’ve probably been waiting for,.. finally!):
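A minimal sketch of that search and the config.lua built from it, assuming portrait orientation and the "letterbox" scale mode (my assumption). The helper name `complexContentWidth` is hypothetical, and the `if display` guard is only there so the snippet also runs under stock Lua:

```lua
-- config.lua sketch for the complex approach.
local TARGET_WIDTH = 160

-- Find the greatest integer divisor of the device width whose quotient
-- is still >= target. Returns the content width and the divisor.
local function complexContentWidth(pixelW, target)
  local start = math.max(1, math.floor(pixelW / target))
  for divisor = start, 1, -1 do
    if pixelW % divisor == 0 then
      return pixelW / divisor, divisor
    end
  end
end

if display then   -- only defined when running inside Corona
  local w = complexContentWidth(display.pixelWidth, TARGET_WIDTH)
  application = {
    content = {
      width  = w,
      height = math.floor(w * display.pixelHeight / display.pixelWidth),
      scale  = "letterbox",   -- assumption
      xAlign = "left",
      yAlign = "top",
    },
  }
end

print(complexContentWidth(720, TARGET_WIDTH))
print(complexContentWidth(768, TARGET_WIDTH))
```

Starting the search at floor(deviceWidth / 160) guarantees the quotient never drops below 160, and a divisor of 1 always succeeds, so the loop always returns.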

Note certain similarities with the simple version – namely that we use “left”/”top” alignment, and calculate the height based on the device’s aspect ratio. The reasons why remain the same – BPBM.

So, what are the practical results of using this config on a non-160N device width? You’ll end up with a content width that is slightly larger than 160, but (this part is critically important, so I hope you’re still awake) the content width that you end up with is guaranteed to be an integer divisor of device width.

The extra content width needed to obtain an integer divisor is now a sort of “virtual letterboxed width” (as opposed to the actual letterboxing of height that was calculated). So, again, repeating myself a bit, the app will need to find something to do with that extra width.

For example, on a 720×1280 device, try this code which mimics the simple example above (and demonstrates how you might account for the extra width if needed):
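One possible way to account for the extra width, sketched here with made-up names, is to center a fixed 160-unit playfield inside whatever content width you end up with and treat the margins as spare space:

```lua
local TARGET_WIDTH = 160

-- Returns the left edge, right edge, and total extra width of a
-- 160-unit playfield centered in the given content width.
local function playfieldBounds(contentW)
  local extra = contentW - TARGET_WIDTH
  local left  = math.floor(extra / 2)   -- keep the edge on a whole content unit
  return left, left + TARGET_WIDTH, extra
end

if display then   -- only defined when running inside Corona
  print(playfieldBounds(display.contentWidth))
end

print(playfieldBounds(180))
```

On a 720×1280 device (content width 180) the playfield would run from x=10 to x=170, with 10 units to spare on each side. Left-aligning the playfield and leaving all the slack on the right is an equally valid choice.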

Or grab the entire source code for this example.

Actual results on a 720×1280 device (cropped vertically just to save space) click for full-size:

The Hybrid Approach

“What?!?! I thought you said there were only going to be two!! I am sooo sick of reading all this, just give me a one-size-fits-all answer and be done with it already!”

Whoa, whoa, settle down! This section doesn’t present anything truly new. Rather it just considers a possible hybrid of the prior two approaches taken together.

Why? The worst-case scenario that I’m aware of for the complex approach, based on extant actual devices, is the iPad Pro’s 2048×2732 display. The greatest integer divisor that can be used is 8, giving a content width of 256. So that’s 96 additional content units of width to deal with – quite a bit (+60%), considering the initial desire for just 160 width.

And maybe that’s just too much.

So, based on the discussion of high-DPI devices above (in the simple approach section) maybe you’d rather use something like the simple approach in that case. Or, craft a “custom” divisor of your own choosing to reduce all that extra width. The trade-off being that you’ll lose perfectly square retro-pixels.

That is, for such high-DPI devices maybe you’d be willing to switch back to non-perfectly-square retro-pixels just to avoid such an overly-wide content area. For example, use a divisor of 12 (giving 2048/12=170.666 content width) and with xAlign=”left” you could simply ignore any fractional pixels left-over on the right side,… right?

Almost, except that this will not work exactly. Why? Because, as stated earlier, width and height in config.lua are treated as integers. There was a reason we were looking for integer quotients as well as integer divisors. So the best you could do would be to specify a content width of 171. And if you use a non-exact integer width, then the integer height you calculate from it can be expected to be similarly non-exact, follow? (use math.ceil() on width, unlike math.floor() when calculating height – just trust me on this, or experiment and prove it to yourself – it’s too esoteric to get into here, but it makes for a better approximation of the letterbox math when using integers)

That non-exact width means that, if you then reverse the math to check yourself, you won’t get back a perfect integer divisor of 12. Rather, the divisor using these sample values would be 2048/171~=11.976. So you’ll still be close to an integer divisor, but not quite – probably closer than with the simple method alone, and the closer you are to integer the fewer non-square retro-pixels will result overall.

Test yourself: can you discern the 11×12 non-square pixels below, even with a grid to help? (such columns ought to occur 4 times) On-device results (cropped vertically just to save space) click for full-size:

Let’s be honest – it’s not pixel perfect, and that WAS the topic, right? So why bother? Because mobile device development is often just a series of compromises. If a few percent of the pixels are a few percent anamorphic, that just might be good enough for some uses if it solved some other problem.

So, let’s table all of the results from the complex approach’s math (at least for known common device resolutions), allowing certain specific values to be overridden when desired, and falling back on the simple approach calculations for everything else not specifically covered.
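As a sketch of what such a table might look like (not the linked source), the entries below are simply the widths worked out in this post, keyed by device pixel width. The table name, the helper name and the choice of which devices to list are all hypothetical:

```lua
local TARGET_WIDTH = 160

-- Content widths keyed by device pixel width. Values come from the
-- complex approach, except where deliberately overridden.
local widthOverrides = {
  [720]  = 180,   -- 720 / 180 = 4
  [768]  = 192,   -- 768 / 192 = 4
  [1080] = 180,   -- 1080 / 180 = 6
  [2048] = 171,   -- override: ceil(2048 / 12) instead of the exact 256
}

-- Anything not listed falls back to the simple approach (fixed 160).
local function hybridContentSize(pixelW, pixelH)
  local width  = widthOverrides[pixelW] or TARGET_WIDTH
  local height = math.floor(width * pixelH / pixelW)
  return width, height
end

print(hybridContentSize(2048, 2732))
print(hybridContentSize(640, 960))

-- Checking the 2048 override by hand:
local effectiveDivisor = 2048 / 171      -- about 11.976, not quite 12
local narrowColumns    = 171 * 12 - 2048 -- 4 columns must be 11 wide, not 12
print(effectiveDivisor, narrowColumns)
```

The returned width and height would then go into config.lua exactly as in the other two approaches.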

Or grab the entire source code for this example.

Note that I’m not suggesting that this approach would be any better or worse than either of the two approaches alone, I’m just throwing it out there for consideration. It might benefit some use cases, but be of no value to others.

BTW, it should be obvious for the nice 160N width devices that all of the three approaches are equivalent. That is, if the device dimensions are such that an integer pixel-perfect scale is readily achievable as is, then that is what is used. It’s only for the “weird” resolutions that extra considerations might need to come into play.

Final Thoughts

If you’ve made it this far, congratulations! I honestly cannot fathom how you survived it. That’s probably the longest blog entry I’ve ever written, or am ever likely to write. Hopefully you’ve picked up a trick or two that’ll be of benefit to your specific usage. As for me, I have just now developed chronic carpal tunnel syndrome and will suffer in agony for eternity,.. probably,.. maybe,.. you’re welcome.

Did you find this useful? Want to support future efforts? Feel free to give my apps some free promo, or make a contribution (it’ll go towards a good cause: more LEGO for my kids). Thanks!

ROBOT-SB dev blog – “pixel perfect” part 2

Let’s now return to the discussion of “pixel perfect” displays. (You might want to first read part 1 if you missed it.) This is mainly a discussion for developers that might be looking to set up a display for “retro” or “low-resolution pixelated” games; however, there are aspects of it (avoiding aliasing) that might have wider application.

Here’s a sneak peek at where all of this is heading – actual on-device screenshot (click to see full-size at effective 5X resolution):

But first, the bad news: If you’re expecting a one-size-fits-all solution that’ll cover every possible setup, forget it. There are just too many “weird” display resolutions out there. Something, somewhere, has to be “flexible” in order to adapt to all those various conditions, but there are several of those “potentially flexible things”, and the one you choose might not match the one I chose.

But second, the good news: In the process of describing what worked for me, you ought to be able to pick up enough foundational insight to craft something that’ll work for you. Most of the techniques are pretty simple, but they’re most effective when you really understand what’s going on “underneath the hood”.

So, for this installment, I’m going to cover two general things that should apply no matter what specific config.lua is in use. These are things you might want to think about even if you’re not setting up a low-resolution display. Next time we’ll dive into config.lua itself.

Texture Filtering

To begin with, it will be necessary to disable bilinear filtering. I touched on this topic last time, but didn’t go into much detail or the “why” of it.

Since the goal here is to set up the content dimensions to be intentionally much smaller than the device resolution, we know that any images that are rendered will have to be scaled up. With OpenGL we have the choice of what sampling method is used to accomplish that resampling.

What we want to avoid is the typical “softening” that occurs when up-sampling an image with bilinear filtering. This is caused by interpolating along the x- and y-axes (thus the term bi-linear) to produce in-between colors for the missing pixels. For example, a black next to a white will produce an in-between gray, and so forth. You’ve probably seen this effect even when using Photoshop (or other image editing software), for example @10x magnification:

See https://en.wikipedia.org/wiki/Bilinear_filtering for specifics.

Nearest-neighbor sampling, on the other hand, simply grabs the existing color from the closest* pixel in the source image, without any interpolation. This is the sampling method needed in this case.

*note that if pixel alignment is off, nearest-neighbor sampling can still produce unwanted artifacts, because the “closest” pixel may not be the one you expect.

So at the very least we must disable (set to nearest-neighbor sampling) the magnification filter, the one that is used when enlarging textures. But common practice is to disable the minification filter also, the one used when reducing textures:
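A minimal sketch of those two settings, using Corona’s display.setDefault() with the “magTextureFilter” and “minTextureFilter” keys. The image file name is hypothetical, and the code only runs inside Corona, since it relies on the global display library:

```lua
-- Set both defaults BEFORE loading any image: the filter setting is
-- recorded with the texture at load time.
display.setDefault( "magTextureFilter", "nearest" )  -- used when enlarging textures
display.setDefault( "minTextureFilter", "nearest" )  -- used when reducing textures

-- "sprite.png" is a hypothetical file name, used only for illustration.
local image = display.newImage( "sprite.png" )
```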

Or grab the complete source for this example.

Because here’s what happens if you don’t, using a device resolution (640×960) that is 4 times the content resolution (160×240) as an example, on-device results:

Note that this example uses two copies of the same image in order to “trick” the texture cache mechanism into loading the first with bilinear filtering and the second without. The filter setting is recorded with the texture at load time, so if you attempt to alter the filter setting on an image that is still in the cache, it will appear to have no effect. (also note that this applies when loading image sheets as well)

Placing Images At Proper Coordinates

I wonder how many Corona SDK devs are aware that images with odd-numbered dimensions need to be positioned differently from images with even-numbered dimensions? (assuming the default center anchors)

For ROBOT-SB every pixel will matter, and I have a fairly balanced mix of even- and odd-sized rasters, so this is definitely something that affects me.

Why? Because the center of an image with odd-numbered dimensions falls at the half-pixel center of the centermost pixel, while the center of an image with even-numbered dimensions falls on the edge between two centermost pixels.
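One way to express that rule in code is a small helper that nudges a requested center so that both edges of the object land on whole-number coordinates. This is only a sketch: the name alignedCenter is made up for illustration, and it assumes the default center anchor and one axis at a time.

```lua
-- Returns the center coordinate nearest to `x` that places both edges of an
-- object `size` units wide on whole-number coordinates (center anchor assumed).
local function alignedCenter( x, size )
    local half = size / 2
    -- Snap the leading edge to the nearest integer, then step back to the center.
    return math.floor( x - half + 0.5 ) + half
end

print( alignedCenter( 100, 32 ) )  -- even size: the center stays on an integer
print( alignedCenter( 100, 33 ) )  -- odd size: the center lands on a half-coordinate
```

Apply it separately to x (with the width) and to y (with the height).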

The condition depicted in the center scenario will badly alias when rendered, because every raster texel is misaligned with the hardware pixels. This is another scenario where using a 1:1 config.lua will help during experimenting, but this time we’ll specify an actual content dimension (the ‘standard’ 320×480) and simply require the demo to run on a device of the same dimensions.

Note that the same effect can occur with any content scale using fractional content coordinates (for example, an image positioned via physics, or during an arbitrary position transition), but it’s more intuitive to recreate the problem at 1:1 scale.

At 1:1 scale we shouldn’t, in theory, need to worry about filtering for this experiment – because a properly positioned image at 1:1 scale won’t need any filtering. In fact, we will intentionally leave bilinear filtering on in order to highlight the issue. (artifacts may still occur even with nearest-neighbor sampling, as stated above and demonstrated in the prior post, though they’ll be of a different character and less obvious)

Or grab the complete source for this example.

And here’s the on-device result:

Magnified 4X (without filtering) just to see it better:

Once we have a “proper” config.lua, with oversized content “pixels” (though remember: content dimensions are not really pixels, just coordinates), it’ll further be important to avoid placing display objects on fractional oversized-pixels as well. I’ll refer to these oversized content pixels as “retro pixels” – to indicate the supposed smallest addressable unit in the simulated low-resolution retro-display. For example, using the perfect 2:1 content scale from the former post, consider something like this:

Both will render fine, without artifacts, because they both properly align with hardware pixels. However, the red image violates the virtual low-resolution of the retro-display by addressing “retro-half-pixels” that should not be separately addressable.

That is, if we really were running at 160×240 hardware pixels then the red image could not be drawn centered at content coordinate [2,2] as shown without having aliasing artifacts. Proper “retro-pixels” occur only on the magenta lines in the illustration above.

And, again, note that for an image with odd dimensions (1×1), it was necessary to position it at half-pixel content coordinates, assuming default center anchoring. (as above)

That’s it for this installment. Next time I hope to get around to actually talking about config.lua.

Until then, you might want to study up on the approach Sergey Lerg presented here: https://github.com/Lerg/smartpixel-config-lua. I certainly don’t intend to claim that I’m the only one who has ever tackled this topic! Sergey’s code is essentially intended to solve the same problem, just at a different resolution, so much of what is presented there will be applicable here. (I’m going to flip the problem around a bit though, essentially inverting the math, because it will better suit my particular needs.)

ROBOT-SB dev blog – pixel art update

I’ll get back to the “perfect pixel” topic soon, but I’m interrupting it to cover some more current activity – a bit of pixel art “sweetening”. These are the sort of little tiny details that I love working on.

The bonus star pickup now has a bit of “sparkle”:

Rocks have been added, and player’s explosion has been dialed up:

(also added was a time-slowdown effect at end-game – I may talk about that more in a later post)

ROBOT-SB dev blog – “pixel perfect” part 1

Welcome to the start of a dev blog for ROBOT-SB, a retro game concept I’m currently working on. It’ll be a low-resolution game, which presents some challenges, and I thought I’d start by talking about setting up the display to support it.

One of the initial challenges is getting a “pixel-perfect” display across all of the target devices, given the wide variety of display resolutions (and thus wide variety of content scaling factors, using Corona SDK’s lingo). Internally, the game should deal with low-resolution “content” dimensions which are then scaled up (and aspect-adjusted, if necessary) to fill the device pixels.

That is, I don’t want high-resolution content dimensions, faking the pixel look with “blocky” high-resolution assets. Rather, I’d like native @1x pixel-art assets like this…

…to scale up as necessary and look like this on device…

…without having to think too much about it.

Well, for starters, a “pixel-perfect” display occurs when there is an integral (integer) scaling factor (or its reciprocal, depending on how you arrange the terms) between device pixels and content coordinates. For example, imagine a 320×480 pixel device with 160×240 content, giving a 2.0 (or 0.5) scaling ratio:
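As a quick check of that condition, here is a sketch in plain Lua. The function names are made up for illustration, and it tests only the scale itself, not the grid alignment discussed further on.

```lua
-- Device pixels per content unit along one axis.
local function contentScale( devicePixels, contentUnits )
    return devicePixels / contentUnits
end

-- True when the scale is a whole number (one necessary condition for "pixel-perfect").
local function isIntegralScale( scale )
    return scale == math.floor( scale )
end

print( isIntegralScale( contentScale( 320, 160 ) ) )  -- 320x480 device: scale is 2
print( isIntegralScale( contentScale( 768, 160 ) ) )  -- 768-wide device: scale is 4.8
```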

At this point, a developer might be tempted to just dive right in and start tweaking their “config.lua” file (the file used by Corona SDK to set content dimensions, et al) to achieve this effect. Simply set width=160, height=240 and done, right? Well, maybe – that would at least work on an actual 320×480 device, but may not work as expected on other device resolutions or aspect ratios… yet.

Because it’s not just the scale, but also the overall alignment of those two “grids”, and letterboxing and centering and other such things can get in the way of perfect alignment. (in Corona SDK you’d see negative values for screenOriginX|Y when this occurs) So, before we jump into config.lua, let’s think about alignment by pondering a few things worth noting in the diagrams above…

Device pixels are physical things, each having a non-zero area. The coordinate of a pixel, say the pixel at [0,0], represents the entire pixel “cell”. You can speak theoretically of fractional pixel coordinates, for example when discussing aliasing effects, but they don’t actually exist. For example, there is no discrete hardware pixel at the coordinate [0.5,0.5]. The diagram above shows the device pixel coordinates at the half-pixel location, to imply that the entire hardware pixel “cell” is represented by those integer coordinates.

Content coordinates are non-physical things, mathematical abstractions derived from the underlying OpenGL view model, and represent precise zero-area locations along each axis. Fractional content coordinates do exist, unlike fractional pixel coordinates. For example, it is entirely reasonable to talk about the content coordinate [0.25,0.25], which at the scale indicated would represent the center of device pixel [0,0]. The diagram above shows the content coordinates “on” the lines because that is how they function within OpenGL.

For example, given the scale indicated above, to draw a “perfect pixel” using a rectangle (we’ll keep it simple with “pure geometry” and put off a discussion of raster images and texture filtering until later) you could do the following:
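A minimal sketch of that rectangle, assuming the 2:1 scale from the diagrams. The helper centerOfDevicePixel is a made-up name, and the `if display` guard is there only so the arithmetic can also be run in a plain Lua interpreter:

```lua
local scale = 2  -- device pixels per content unit (320x480 device, 160x240 content)

-- Content coordinate at the center of a given device pixel (one axis).
local function centerOfDevicePixel( pixel )
    return ( pixel + 0.5 ) / scale
end

local x, y = centerOfDevicePixel( 1 ), centerOfDevicePixel( 1 )  -- 0.75, 0.75
local size = 1 / scale                                           -- 0.5

-- `display` exists only inside Corona.
if display then
    local rect = display.newRect( x, y, size, size )
    rect:setFillColor( 1, 1, 1 )
end
```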


Remember that the scaling factor is 2:1, that’s why we create a rectangle with content dimensions 0.5×0.5 to get device dimensions of 1×1. Also remember that, by default in Corona SDK, display objects are positioned via their centers – content coordinate [0.75,0.75] is at the center of device pixel [1,1]. So, this will create a single “perfect pixel” at device coordinates [1,1].

Well then, what would have happened if we had instead created that rectangle centered on integer content coordinates? That’s the failure scenario, resulting in a non-pixel-perfect display, and it’s instructive to understand why:


When the device’s GPU attempts to rasterize that geometry it’s going to find that it partially overlaps four hardware pixels, and (by default) will generate a filtered image, whereby a portion of the geometry contributes to each of those four pixels.
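The overlap is easy to work out by hand, one axis at a time. The sketch below (plain Lua, with a made-up function name, and describing simple area coverage rather than what any particular GPU does) compares the aligned rectangle with one centered on an integer content coordinate:

```lua
local scale = 2  -- device pixels per content unit, as in the diagrams

-- Fraction of device pixel `pixel` (one axis) covered by a span centered at
-- content coordinate `center` with content length `length`.
local function coverage( center, length, pixel )
    local lo = ( center - length / 2 ) * scale
    local hi = ( center + length / 2 ) * scale
    local overlap = math.min( hi, pixel + 1 ) - math.max( lo, pixel )
    return math.max( 0, overlap )
end

-- Aligned: centered at 0.75, the span covers device pixel 1 completely.
print( coverage( 0.75, 0.5, 1 ) )
-- Misaligned: centered at 1, it covers half of pixel 1 and half of pixel 2.
print( coverage( 1, 0.5, 1 ), coverage( 1, 0.5, 2 ) )
```

Half coverage on each axis means each of the four touched device pixels receives a quarter of the rectangle’s area.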

So now let’s get away from “diagrams” and present actual on-device results. Using a config.lua where width and height are not specified will give content dimensions that match the device – and this is a great place to start experimenting because it’s “easy” to then address specific device pixels with content coordinates and test for alignment/aliasing effects.

I used this code to compare properly and improperly aligned images against the device pixel grid:

Actual device output:

And scaled up 4X (without filtering) just to see better:

An important thing to note is that simply turning off bilinear filtering is not enough to completely solve the problem (note the left side of the right-most “badSound” image). What really matters is alignment. With proper alignment (as with “goodMusic” and “goodSound”) it doesn’t matter whether you’re using nearest-neighbor or bilinear texture filtering because the image texels and device pixels match perfectly, so no filtering is even needed!

The lesson to be learned is that if you render a non-aligned “pixel” for whatever reason, then you’re going to end up with some sort of unwanted artifacts. Fortunately, now that we’re armed with a bit of understanding about how “fractional pixels” arise in the first place, we can bake the solution (or at least most of it – accounting for the hardware side of things) into our “config.lua” file. More on that next time.

Corona Quick Thought – Survey of Lua OOP Patterns

One of the issues that seems to confuse beginners is the bewildering variety of Object-Oriented Programming approaches possible with Lua. Though Lua itself has no native/intrinsic OOP facilities, it does offer all the raw features you need to implement something that acts like a traditional OOP language (to one degree or another).

As far as Lua is concerned, an “object” is merely yet another table containing methods and properties. Once this sinks in, it becomes clear that there are numerous ways you might create and populate such a table, as well as countless minor variations on each of the major strategies — thus the many and varied OOP approaches.

This is intended only to be a quick “survey” of some of the more common OOP strategies in Lua – particularly as it applies to Corona developers. It is not intended to be a “how to” (Google is your friend for that) so little explanation of the implementations will be provided. A thoughtful read and a bit of investigation should easily reveal each strategy’s “trick” though.

Implementations are provided in bare-minimum “do-it-yourself” form, but are completely self-contained and run-able as-is. Complete sample source code here.

Disclaimer: The sample code is meant for illustrative purposes only as material to accompany this “survey” — you are on your own if you insist on treating it as production code despite this disclaimer.

Those looking for a more polished and complete OOP system should look for one of the third-party libraries such as 30log or middleclass or others. Usage of any of these OOP libraries is not covered here. (though each such library likely follows one or more of these ‘patterns’ or variations thereof)

As a sample “case study”, each implementation below will consider a game scenario with two game “entities”: a “player” and an “enemy”. Each has its own display object for visual representation, some custom properties (e.g. name) and methods, and getters/setters (e.g. position) to show the approach for interfacing with the display object. The display object will just be a simple colored rectangle, to eliminate the need for image files.

I’ll start with what I consider to be the simpler approaches, then move toward the more complex approaches.

1. Decorated Display Objects
Probably the simplest way to add custom behavior to a display object — just create a display object then add your custom properties and methods to it directly. This may tend to produce a lot of duplicated code if your objects are closely related, and doesn’t yet support the creation of multiple instances, but might be suitable if only a few unique “singletons” are needed.

2. Factory Function with Closures
This strategy typically develops out of the desire to eliminate some of the duplicated code that can occur with 1) above by supporting “instances”. The use of closures also offers an opportunity to support truly private members, though each individual instance still remains a bit “heavy” with its own unique copy of each method.
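A bare-bones sketch of this strategy follows. It is not the sample code referenced above: the helper newView and the `health` member are made up for illustration, and a plain table stands in for the display object when the code runs outside Corona.

```lua
-- Hypothetical helper: a real rectangle inside Corona, a plain table elsewhere.
local function newView( x, y )
    if display then return display.newRect( x, y, 16, 16 ) end
    return { x = x, y = y }
end

local function newEntity( name, x, y )
    local self = newView( x, y )
    local health = 100  -- private: reachable only through the closures below

    -- Each instance gets its own copy of these functions (the "heavy" part).
    function self.getName() return name end
    function self.getPosition() return self.x, self.y end
    function self.setPosition( nx, ny ) self.x, self.y = nx, ny end
    function self.damage( amount )
        health = math.max( 0, health - amount )
        return health
    end
    return self
end

local player = newEntity( "player", 10, 20 )
local enemy  = newEntity( "enemy", 50, 60 )
player.damage( 30 )
```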

3. Factory Function with Closures and Statics
This strategy typically develops out of the desire to retain private closure members from 2) above, while reducing the “weight” of each individual instance by referencing a set of “static” functions for non-closure members.

4A. Factory Function via Prototype
This strategy typically develops from prior “factory” methods just as a means to simplify populating a newly created instance with its members from a predefined prototype or “pattern”.
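Sketched in its simplest form, with a shallow copy from the prototype. The names are illustrative only, and the newView stand-in is a made-up helper so that the code also runs outside Corona:

```lua
-- Hypothetical helper: a real rectangle inside Corona, a plain table elsewhere.
local function newView( x, y )
    if display then return display.newRect( x, y, 16, 16 ) end
    return { x = x, y = y }
end

-- The prototype ("pattern") that every new instance is populated from.
local EntityPrototype = {
    name = "unnamed",
    getName = function( self ) return self.name end,
    getPosition = function( self ) return self.x, self.y end,
    setPosition = function( self, x, y ) self.x, self.y = x, y end,
}

local function newEntity( name, x, y )
    local instance = newView( x, y )
    for key, value in pairs( EntityPrototype ) do  -- shallow copy of members
        instance[key] = value
    end
    instance.name = name
    return instance
end

local player = newEntity( "player", 10, 20 )
local enemy  = newEntity( "enemy", 50, 60 )
```

Because only references are copied, every instance shares the same function values.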

4B. Factory Function via Prototype (Alternate Syntax)
This strategy is equivalent to 4A above, just worded a bit differently where the prototype is declared. Those wishing to implement a “prototype” approach will often want to pass in some other “class” which serves as the “pattern”, and the syntax used to establish the “pattern” in this variation perhaps better illustrates that approach. (you may also want a “deep” copy, rather than the shallow one here – an implementation detail beyond the scope of this survey, just mentioned in passing)

5. Wrapper Class
This strategy typically develops once some need for more-elaborate inheritance is identified. At some point it may become more tedious to continue “decorating” display objects directly (as all previous methods have done), and instead treat the “view” (the display object) as separate from its “model” (the Lua class). Something along these lines is a typical first “baby step” implementation of metatable classes.

6. Metatable Class Inheritance
This strategy typically develops directly from the needs identified in 5) above. Now that a metatable-based class system is in place, it is possible to consolidate shared functionality in a super-class and derive special-purpose subclasses from it.
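A compact sketch of the idea, using the usual __index chaining. The class and method names are illustrative, the view is kept separate from the model as in 5) above, and the newView stand-in is a made-up helper so the code also runs outside Corona:

```lua
-- Hypothetical helper: a real rectangle inside Corona, a plain table elsewhere.
local function newView( x, y )
    if display then return display.newRect( x, y, 16, 16 ) end
    return { x = x, y = y }
end

-- Super-class: shared functionality lives here.
local Entity = {}
Entity.__index = Entity

function Entity.new( name, x, y )
    local self = setmetatable( {}, Entity )
    self.name = name
    self.view = newView( x, y )
    return self
end
function Entity:getName() return self.name end
function Entity:getPosition() return self.view.x, self.view.y end
function Entity:setPosition( x, y ) self.view.x, self.view.y = x, y end
function Entity:describe() return self.name .. " (entity)" end

-- Subclass: failed lookups on Enemy fall through to Entity.
local Enemy = setmetatable( {}, { __index = Entity } )
Enemy.__index = Enemy

function Enemy.new( name, x, y )
    return setmetatable( Entity.new( name, x, y ), Enemy )
end
function Enemy:describe() return self.name .. " (enemy)" end  -- override

local player = Entity.new( "player", 10, 20 )
local enemy  = Enemy.new( "enemy", 50, 60 )
```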

7. Metatable Class Inheritance with Function Helper
This strategy typically develops out of an attempt to better “organize” the functionality of 6) above in preparation for an even more elaborate inheritance hierarchy. Some type of helper functions (as here) or a “base class” with similar functions, is used to create/extend classes instead of inline code. This helps prep for storing class names (for identification) or a reference to the superclass (for instanceOf or super calls) within this class creation “framework”. (if you’re still writing DIY code at this point, then it’s perhaps time to start seriously considering whether one of the third-party libraries mentioned up top might save you some effort)

8. Display Object Metatable Chaining
This strategy typically develops out of a need to “monkey patch” something deep within the internals of a Corona object. It is not typically seen in usage for more general-purpose OOP needs where a simpler decorator pattern would likely suffice. As such, the sample used throughout this post is really not an appropriate use for this strategy – it is provided merely to close out the metatable portion of this survey.

Conclusion
So, is this a full and complete list of all possible approaches to Lua OOP? Hah! Nope. Not even close. But it’s a pretty good start on some of the more common ‘fundamentals’ used by more elaborate approaches.

Once you grasp these ‘fundamentals’ then it becomes more obvious how you might extend them further to support other OOP concepts. (though again, as previously stated several times, there are existing third-party libraries that you might want to investigate and potentially save some development effort if you can find one that fits your needs)

I’ll suggest that there is no single best approach for all users, under all circumstances, particularly given the vast range of experience levels among Corona developers. So just select an approach that best works for you. Hopefully this survey might help a bit in weeding through those choices.

Corona Quick Thought – more on easing functions

A follow-up as prequel to the previous quick thought (regarding Robert Penner’s easing equations in Corona)

This is just intended as some casual/beginner “background material” on easing, maybe (?) helpful if you might be exploring/creating your own curves at some point.

Let’s take a step back, and ask: what IS easing? Essentially, it’s just a linear interpolation, where the interpolation factor (typically as “t”) has been “tweaked” so that it now has a non-linear response curve.

Let’s consider basic linear interpolation, which is often formulated so:
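A minimal sketch of the usual formulation, in plain Lua:

```lua
-- Linear interpolation from a to b, with t on [0..1].
local function lerp( a, b, t )
    return a + ( b - a ) * t
end

print( lerp( 10, 20, 0 ), lerp( 10, 20, 0.5 ), lerp( 10, 20, 1 ) )
```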

Where “a” is the initial value, “b” is the target value, and “t” is the parameter on [0..1]. When t==0, a is returned; when t==1, b is returned; when t is some intermediate value, the corresponding intermediate value is returned.

(Aside: The “(b-a)” portion is the “delta” between target and initial values. Often in animation system usage, as with Corona’s implementation, you’ll find the delta pre-calculated instead of passing the target value. Both approaches are equivalent.)

Let’s now “tweak” our lerp function to allow for easing:
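In sketch form, with a pass-through easing function as the starting point:

```lua
-- Reshapes t; for now it just returns its input, so the response stays linear.
local function easing( t )
    return t
end

-- Same lerp as before, except that t now passes through easing() first.
local function lerp( a, b, t )
    return a + ( b - a ) * easing( t )
end

print( lerp( 10, 20, 0.5 ) )
```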

So far, not very exciting. All we’ve done is add a “wrapper” around t to potentially allow for it to be altered. Our easing function will take an input value on [0..1] and return a potentially reshaped output value, again on [0..1]. Right now, all we have is a return of the original value, again producing a linear response.

Here’s the input/output curve of our current easing function:

Linear

Again, not yet very exciting. So let’s alter the easing function slightly:
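For instance, squaring the input (a sketch; the endpoints 0 and 1 still map to themselves, only the in-between values change):

```lua
-- Quadratic "in" easing: output lags behind the input early on, then catches up.
local function easing( t )
    return t * t
end

print( easing( 0 ), easing( 0.5 ), easing( 1 ) )
```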

Which now produces a quadratic response that looks like this:

inQuadratic

This curve, when used in an animation system, will cause slower initial movement, gradually speeding up until reaching the target at a more rapid velocity.

We can exaggerate that effect even further by simply using a cubic function (t*t*t). (or even higher order quartic, quintic, sextic/hexic, septic/heptic, etc – though there comes a point of diminishing returns for practical usage)

So far I’ve been presenting just the “in” response curve, but I’ll switch to the in/out response curves below, as I think it’s helpful to start visualizing the symmetry.

CubicSymmetry

See the code in the previous quick thought for implementation details on how the symmetry was accomplished for the in/out curve.

Essentially, the diagonal symmetry of the out curve is achieved by inverting both axes (t_in and t_out) using 1-easing(1-t). Then the in/out curve is achieved by scaling the in curve into the lower-left quadrant, and scaling/translating the out curve into the upper-right quadrant.
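Written out for the cubic case, that construction might look like the following sketch. It borrows the icurve/ocurve/iocurve naming from the previous quick thought, but it is not a copy of that code:

```lua
-- "in" curve: plain cubic.
local function icurve( t )
    return t * t * t
end

-- "out" curve: the in curve with both axes inverted.
local function ocurve( t )
    return 1 - icurve( 1 - t )
end

-- "in/out" curve: in curve scaled into the lower-left quadrant,
-- out curve scaled and translated into the upper-right quadrant.
local function iocurve( t )
    if t < 0.5 then
        return icurve( 2 * t ) / 2
    else
        return ocurve( 2 * t - 1 ) / 2 + 0.5
    end
end

print( iocurve( 0.25 ), iocurve( 0.5 ), iocurve( 0.75 ) )
```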

Now let’s look at the “Back” family of curves, as implemented in the previous article, which have an initial “overshoot” (or “undershoot”, depending on your perspective) resulting from differencing a polynomial of one degree from another polynomial of a different degree. That curve looks like this (when s==2.6):

CubicOvershootSymmetry

This curve is a pretty good basis for further exploration, as it has “just enough” complexity to inspire further modifications, yet not so much complexity that it hinders comprehension.

The most obvious tweak has already been addressed: alter the relative scaling between two polynomials by introducing a user-adjustable constant (“s”).

Perhaps the next most “obvious” experiment would be to change the degrees of the polynomials. For example, consider this function:

Producing this response curve:

QuadraticUndershootSymmetry

Having the animation effect of quickly moving to a bit more than halfway, then backing off a bit, then continuing on to the final value. A sort of “briefly unsure”, “moment of doubt”, “hesitant” or “reluctant” motion, if we anthropomorphize it. (as if it were saying: “Wait, did I leave the oven on at home? I’d better go back and check. No, no… I’m sure I turned it off – now I’d better hurry to make up for lost time!” :D)

You can insert that equation into the icurve() function of the prior article if you’d like to tinker further. (and, again, you can adjust the degree of overshoot by adjusting “s” to suit)

All for now.

Corona Quick Thought – a custom inOutBack easing function

I was reminded by a recent post in the forum that the calling convention for Corona’s easing functions isn’t well-documented. It’s fairly easy to “reverse-engineer” by experimenting, if you’re at all familiar with Robert Penner’s easing equations, and doing so will allow you to potentially create your own custom easing formulae.

At some point the subject should be treated with more depth, but I’m not feeling particularly ambitious today. 😀 So without further ado, here’s the aforementioned function:

Note that this function probably doesn’t exactly duplicate Corona/Penner, so I renamed it to avoid “claiming” that it did. Still, it’s mathematically very similar.

For readability, I’ve left it unoptimized to reveal its internal workings and 2t^3-t^2 origin — you can refactor and optimize further as you like. As written, it’s (intended to be) useful as a test-bed for developing new curves – just alter the icurve() function. Once you understand how all the “parts” work together, then you can inline all those unnecessary function calls.

The component in and out curves are in there as well, so if you don’t want the inOut version then just replace the call to iocurve() in the return statement with either icurve() or ocurve(). (or create separate versions of all three)
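For reference, here is a sketch of how such a function can be put together. It assumes the Penner-style “back” polynomial (s+1)t^3 - s*t^2, which reduces to 2t^3-t^2 when s is 1, and it assumes the calling convention is (t, tMax, start, delta): elapsed time, total time, start value and total change. The name customInOutBack is made up for illustration.

```lua
local s = 2.6  -- controls how much overshoot occurs

-- "in" curve: dips below zero before heading to 1.
local function icurve( t )
    return t * t * ( ( s + 1 ) * t - s )
end

-- "out" curve: the in curve with both axes inverted.
local function ocurve( t )
    return 1 - icurve( 1 - t )
end

-- "in/out" curve: in curve for the first half, out curve for the second.
local function iocurve( t )
    if t < 0.5 then
        return icurve( 2 * t ) / 2
    else
        return ocurve( 2 * t - 1 ) / 2 + 0.5
    end
end

-- Assumed calling convention: elapsed time, total time, start value, delta.
local function customInOutBack( t, tMax, start, delta )
    return start + delta * iocurve( t / tMax )
end

print( customInOutBack( 250, 1000, 0, 100 ) )  -- a quarter of the way in: below the start value
```

A function shaped like this can be passed as the `transition` value in the parameter table of transition.to().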

Test it like this:

Now, for the fun part: See that “s = 2.6” in there? That’s what controls “how much” overshoot occurs, and the 2.6 gives it about a 10% overshoot. If you’d like less overshoot, then reduce the value of s; if you’d like more overshoot, then increase the value of s.