Friday, 30 May 2014

etcpak 0.3

New major version, new features:
  • Ability to create mipmaps (only POT, not in benchmark mode).
  • Optional dithering of input image.
  • Small quality improvements at basically no cost.
The image minification algorithm used for generating mipmaps is stupidly simple, but it already beats the implementation in PVRTexTool:
Left: PVRTexTool, Right: etcpak
Notice the high-frequency artifacts present in the PVRTexTool image, particularly near the parrots' eyes. etcpak produces a smoother and more natural look. Further refinements should improve the image quality even more.
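
As an illustration only: the post does not say which minification filter etcpak uses, so the sketch below assumes the simplest common choice, a 2x2 box filter that averages four source pixels into one. The names `halve` and `mip_chain` are made up for this example, and the power-of-two check mirrors the POT-only restriction mentioned above.

```python
def halve(pixels, width, height):
    """Downsample an RGBA image by 2 in each dimension with a 2x2 box filter.

    `pixels` is a flat, row-major list of (r, g, b, a) tuples.
    """
    assert width % 2 == 0 and height % 2 == 0
    out = []
    for y in range(0, height, 2):
        for x in range(0, width, 2):
            block = [pixels[(y + dy) * width + (x + dx)]
                     for dy in (0, 1) for dx in (0, 1)]
            # Average each channel over the four source pixels, with rounding.
            out.append(tuple((sum(p[c] for p in block) + 2) // 4
                             for c in range(4)))
    return out, width // 2, height // 2


def mip_chain(pixels, width, height):
    """Return all mip levels, starting with the full-size image."""
    # Both dimensions must be powers of two (assumed restriction).
    assert width & (width - 1) == 0 and height & (height - 1) == 0
    levels = [(pixels, width, height)]
    # This sketch stops as soon as either dimension reaches 1.
    while width > 1 and height > 1:
        pixels, width, height = halve(pixels, width, height)
        levels.append((pixels, width, height))
    return levels
```

A box filter like this cannot add detail that is absent from the source, which may be one reason a simple minifier avoids the kind of high-frequency artifacts described above.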

Dithering basically improves the appearance of gradients or smooth areas in photos:
Left: no dithering, Right: dithering enabled
It comes at a small cost, however: about 18% more compression time in the runs below. Here are the timings for normal compression:
$ x64/Release/etcpak.exe 8192.png -b
Image load time: 1352.646 ms
Mean compression time for 50 runs: 630.855 ms
And this is the run with dithering enabled:

$ x64/Release/etcpak.exe 8192.png -b -d
Image load time: 1312.084 ms
Mean compression time for 50 runs: 744.394 ms
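
To show why dithering helps gradients, here is a minimal sketch of ordered (Bayer) dithering applied before reducing a channel to 5 bits. This is an assumption for illustration: the post does not state which dithering method or bit depth etcpak uses, and `quantize` and `dither_channel` are hypothetical names.

```python
# Standard 4x4 Bayer threshold matrix.
BAYER_4X4 = [
    [0, 8, 2, 10],
    [12, 4, 14, 6],
    [3, 11, 1, 9],
    [15, 7, 13, 5],
]


def quantize(value, bits):
    """Round an 8-bit value to `bits` bits, then expand it back to 8 bits."""
    levels = (1 << bits) - 1
    q = (value * levels + 127) // 255
    return (q * 255 + levels // 2) // levels


def dither_channel(values, width, height, bits=5):
    """Ordered-dither one channel (flat, row-major list of 0..255 values)."""
    step = 255 / ((1 << bits) - 1)  # distance between representable levels
    out = []
    for y in range(height):
        for x in range(width):
            # Position-dependent offset of roughly +/- half a quantization step.
            offset = ((BAYER_4X4[y % 4][x % 4] + 0.5) / 16 - 0.5) * step
            v = min(255, max(0, round(values[y * width + x] + offset)))
            out.append(quantize(v, bits))
    return out
```

Without dithering, a flat area of value 103 collapses to the single level 107. With dithering it becomes a mix of 99 and 107 whose average is much closer to the original, which is what hides banding in smooth areas.
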
Download: https://bitbucket.org/wolfpld/etcpak/downloads

Thursday, 13 March 2014

etcpak 0.2.2

This version contains some minor performance improvements and a benchmark mode, which can be activated using the -b parameter. It performs 50 compression passes and prints out the average time for one pass. It should provide a better environment for measurements, as PNG decoding is the slowest component during normal operation.
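
The idea behind the benchmark mode can be sketched as follows: time the image load once, then time only the compression over many passes and report the mean. This is a rough sketch, not etcpak's code. The `benchmark` function and its `load` and `compress` callables are hypothetical stand-ins.

```python
import time


def benchmark(load, compress, runs=50):
    """Return (load time in ms, mean compression time in ms over `runs` passes)."""
    t0 = time.perf_counter()
    image = load()
    load_ms = (time.perf_counter() - t0) * 1000.0

    total = 0.0
    for _ in range(runs):
        t0 = time.perf_counter()
        compress(image)  # only this call is included in the mean
        total += time.perf_counter() - t0
    return load_ms, total * 1000.0 / runs
```

Keeping the load outside the timed loop is what separates the compressor's speed from the PNG decoder's.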

I've also made an example 8192x8192 image available for test purposes. It is based on the Carina Nebula shot from Hubble.

For comparison, here's the previous method of speed measurement, heavily influenced by the PNG decoder:
$ time etcpak.exe 8192.png

real    0m1.471s
user    0m0.000s
sys     0m0.030s
And here's the new benchmark mode:
$ etcpak.exe 8192.png -b
Image load time: 1330.949 ms
Mean compression time for 50 runs: 631.308 ms

Download: https://bitbucket.org/wolfpld/etcpak/downloads

Saturday, 10 August 2013

etcpak 0.2.1

A new version of etcpak has been published today. What's new:
  • Reduced number of spawned threads and context switches.
  • Memory-mapped files are used for output. This allows writing compressed data to disk during compression. The downside is that writing PVR output files can no longer be disabled.
  • The 32-bit version has been discontinued. From now on only the 64-bit version will be provided. It was always the recommended one anyway, as it performed much better than the 32-bit one.
  • Various optimizations.
etcpak 0.2.1 is 10% faster than etcpak 0.2, with the compression time measured at 0.08 s (after deducting PNG load time).
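
Memory-mapped output works well here because the compressed size is known up front: ETC1 stores each 4x4 pixel tile in 8 bytes, so every block has a fixed position in the file. The sketch below shows the general technique with Python's `mmap` module. The function names and the `header_size` parameter are hypothetical, and nothing here is taken from etcpak's source.

```python
import mmap

BLOCK_BYTES = 8  # one ETC1 block encodes a 4x4 pixel tile in 64 bits


def open_output(path, width, height, header_size=0):
    """Create the output file at its final size and map it into memory."""
    size = header_size + (width // 4) * (height // 4) * BLOCK_BYTES
    f = open(path, "w+b")
    f.truncate(size)  # reserve the whole file before any block is written
    return f, mmap.mmap(f.fileno(), size)


def store_block(mapping, index, block, header_size=0):
    """Write one compressed block straight into its slot in the mapped file."""
    offset = header_size + index * BLOCK_BYTES
    mapping[offset:offset + BLOCK_BYTES] = block
```

Worker threads can store blocks in any order as they finish, and the operating system writes the dirty pages to disk in the background.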

Download: https://bitbucket.org/wolfpld/etcpak/downloads

Sunday, 7 July 2013

etcpak 0.2

It would appear that etcpak was programmed in quite an inefficient way up until now. That's a funny thing to say about a program which was already an order of magnitude faster than the competing ones. And it's one of those obvious things that make you wonder afterwards how you didn't think of it in the first place.

etcpak no longer waits for all image data to be available before it starts compressing. Data processing is now performed concurrently with PNG decoding, which basically means that by the time the source image is fully loaded, we're almost done with the compression.
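
The overlap of decoding and compression is a classic producer/consumer arrangement. In the sketch below the decoder hands over bands of image rows as soon as they are ready, and worker threads compress them while decoding continues. This is a sketch under assumptions: `run_pipeline`, `decode_bands` and `compress_band` are hypothetical names, and etcpak's actual threading model is not described in the post.

```python
import queue
import threading


def run_pipeline(decode_bands, compress_band, workers=2):
    """Compress bands while the decoder is still producing later ones."""
    bands = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            item = bands.get()
            if item is None:  # sentinel: no more bands will arrive
                break
            index, band = item
            compressed = compress_band(band)
            with lock:
                results[index] = compressed

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    # The producer: each decoded band is queued immediately.
    for index, band in enumerate(decode_bands()):
        bands.put((index, band))
    for _ in threads:
        bands.put(None)
    for t in threads:
        t.join()
    # Reassemble the output in the original band order.
    return [results[i] for i in range(len(results))]
```

Since ETC works on 4x4 tiles, a natural band would be four image rows, but any unit that the decoder delivers incrementally would do.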

Some numbers.

| Test | Time (full) | Time (minus PNG load) |
|---|---|---|
| etcpak 0.1 RGB | 1.12 s | 0.45 s |
| etcpak 0.1 RGB + alpha | 1.36 s | 0.69 s |
| etcpak 0.2 RGB | 0.83 s | 0.16 s |
| etcpak 0.2 RGB + alpha | 1.00 s | 0.33 s |

This new version can be downloaded from https://bitbucket.org/wolfpld/etcpak/downloads.

Sunday, 9 June 2013

Fastest ETC compressor on the planet: etcpak

My new (mobile) game has quite a lot of assets. Four different resolution sets, more than 23000 source images in each of them. After packing everything into atlases, the resulting data set is about 1106 Mpixels big. Since each pixel occupies 4 bytes (RGB + alpha channel), the raw data would roughly fit on a DVD disc.
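
The DVD comparison can be checked with a few lines of arithmetic. The sketch assumes that a megapixel means 10^6 pixels and that the DVD is a single-layer one with a nominal 4.7 GB capacity. The last line uses ETC1's fixed rate of 4 bits per pixel to show what compression would save on the RGB data alone.

```python
MPIXELS = 1106             # size of the packed data set, from the text
BYTES_PER_PIXEL = 4        # RGB + alpha, 8 bits per channel
DVD_BYTES = 4.7e9          # assumed: nominal single-layer DVD capacity

pixels = MPIXELS * 10**6
raw_bytes = pixels * BYTES_PER_PIXEL   # uncompressed RGBA
etc1_rgb_bytes = pixels // 2           # ETC1 uses 4 bits per pixel

print(f"raw RGBA: {raw_bytes / 1e9:.2f} GB")
print(f"fits on a DVD: {raw_bytes < DVD_BYTES}")
print(f"ETC1 RGB only: {etc1_rgb_bytes / 1e6:.0f} MB")
```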

Of course, 3D hardware has supported texture compression basically since the beginning of the 3D revolution, which is quite some time. In the mobile space there are two major texture compression formats. The first one (and the only one supported on iOS) is PVRTC. The second one is ETC (Ericsson Texture Compression), and it's supported by virtually all OpenGL ES devices. It's also here to stay, as ETC compression support is mandatory in OpenGL ES 3.0 and OpenGL 4.3.

Now, the problem is that compression takes time. A looong time. On a dedicated i7 CPU it takes about 3 hours to compress all my atlases to PVRTC format, using ImgTec's PVRTexTool utility. ETC is better with about half an hour, but that's still unacceptable for a quick, iterative development. There are various quality settings, you can choose between perceptual and non-perceptual processing, but even in the fastest mode the compression is still unbearably slow.

There are other compression utilities available, but they fare no better. I am aware of the following ones:
I have a test image, which is a real-life 4096x4096 RGBA texture atlas filled up to about 87%. Some tools load PNG files and others require PPM input, which basically amounts to streaming raw image data from the disk. To keep the comparison fair, I measured the load time of the PNG test image on my i3 540 at 0.67 seconds. Every utility that loads a PNG image has that time deducted from its total time, even if the program reports that loading took longer (for example, crunch says it loads the texture in 1.029 s).

| Tool | Command line | Time |
|---|---|---|
| PVRTexToolCL 3.40 | PVRTexTool.exe -i atlas-base1.png -o pvr.pvr -f ETC1 -q etcfast | 24.71 s |
| ericsson ETCPACK 1.06 | etcpack.exe -s fast -e nonperceptual atlas-base1.ppm etc.ktx | 23.86 s |
| mali etcpack 4.0.1 | etcpack.exe atlas-base1.ppm . -s fast -e nonperceptual -c etc1 | 19.20 s |
| crunch (rg-etc1) 1.04 | crunch_x64 -ETC1 -fileformat KTX -mipMode none -uniformMetrics -dxtQuality superfast -file atlas-base1.png | 4.41 s |

So, crunch is really fast, isn't it? Well, I didn't know that before I set out to write my own compression utility. And it runs circles around crunch. The compression time is 0.45 s. That's not a typo, it's 10x as fast as the fastest utility previously available. It's 50x as fast as PVRTexTool. And it has a special mode for processing alpha channel textures. Creating two ETC textures, one with RGB data and a second one with alpha channel takes 0.69 s. That's the time it takes to decompress the PNG image. And it's so fast you will be limited by HDD I/O wait.
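
ETC1 has no alpha channel, which is why an RGBA atlas ends up as two textures. One common way to do the split, assumed here for illustration and not confirmed as etcpak's method, is to copy the alpha values into the colour channels of a second, greyscale image. The name `split_alpha` is hypothetical.

```python
def split_alpha(pixels):
    """Split RGBA pixels into an RGB image and a greyscale alpha image.

    `pixels` is a list of (r, g, b, a) tuples. Both results are lists of
    (r, g, b) tuples, ready to be compressed as two separate ETC1 textures.
    """
    rgb = [(r, g, b) for r, g, b, _ in pixels]
    # Replicate alpha into all three channels of the second image.
    alpha = [(a, a, a) for _, _, _, a in pixels]
    return rgb, alpha
```

At render time a shader would sample both textures and use any channel of the second one as the alpha value.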

As for the resulting image quality, my tool was never intended for production usage. And for testing during development it doesn't look that bad. Take a look.

Left: original, Right: compressed

You can download the Windows executables (both 32-bit and 64-bit, but use the 64-bit one, as it's a lot faster) from https://bitbucket.org/wolfpld/etcpak/downloads. As usual, the MSVC redistributable is required.

[edit: new version is available]

Source code can be found at bitbucket.

Wednesday, 24 October 2012

N900 software rendering

Some time ago I wrote a software renderer and presented a video of it running on an N73. Then I ported it to the N900, updated the model and lighting, but never actually published a video of it in action. Well, here it is:
I think the low FPS values (around 12) were the reason it was not published. They are due to the number of triangles in the new model. The old Caesar model was much simpler and thus rendered faster. With a proper low-poly model the above animation would run at 30 FPS or more without any problems.

Fun side note. The software renderer on N900 was actually faster than running the hardware accelerated version. Well, hardware and/or drivers sucked greatly on that phone.

Tuesday, 2 October 2012

CRT-like rendering on LCD monitors followup

Apparently some folks on some strange forum-like site have been wondering how the CRT effect works. Next time you should write a comment to the entry instead of relying on me watching site traffic analysis.

Anyway. I have prepared a stripped down version of the code and it should be simple enough for anybody competent to replicate the effect in his own code. As the ReadMe file says, the shaders are not optimized in any way whatsoever. Some of them are written in a blatantly bad way. But it's a good starting point for anyone interested.

Windows binary: http://team.pld-linux.org/~wolf/CRT%20demo.7z. You will probably need MSVC 2012 redistributable package.
Source code: http://team.pld-linux.org/~wolf/CRT%20demo%20src.7z