Wednesday, 11 July 2018

etcpak 0.6

A new version of etcpak has been released. Version 0.6 contains a couple of small changes, but the main addition is support for compressing ETC2 RGBA textures.

Example compression result (only showing alpha channel):


16K x 16K image benchmark:
ETC1: 113 ms (only RGB part)
ETC2 RGB: 213 ms (only RGB part)
ETC2 RGBA: 404 ms
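
To put these timings in perspective, they can be converted to pixel throughput. The short sketch below does the arithmetic, assuming that "16K x 16K" means 16384 x 16384 pixels and that a megapixel is 10^6 pixels; the function and variable names are made up for this example and are not part of etcpak.

```python
# Assumption: "16K x 16K" is read as 16384 x 16384 pixels.
PIXELS = 16384 * 16384

# Timings in milliseconds, copied from the benchmark above.
timings_ms = {"ETC1": 113, "ETC2 RGB": 213, "ETC2 RGBA": 404}


def throughput_mpx_per_s(pixels, ms):
    """Convert a pixel count and a time in ms to megapixels per second."""
    return pixels / (ms / 1000) / 1e6


throughput = {
    name: throughput_mpx_per_s(PIXELS, ms) for name, ms in timings_ms.items()
}

for name, value in throughput.items():
    print(f"{name}: {value:.0f} Mpx/s")
```

Under these assumptions ETC1 works out to roughly 2.4 gigapixels per second, and the full ETC2 RGBA path to roughly 0.66 gigapixels per second.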

Tracy Profiler 0.3

A new version of tracy has been released. A short summary of the new features:


Complete list of features:

- Breaking change: the format of trace files has changed.
  - Previous versions of tracy will crash when trying to open new traces.
  - Loading of traces saved by previous versions is supported.
  - Tracy will no longer crash when trying to load traces saved by future
    versions. Instead, a dialog advising to update will be displayed.
  - Tracy will no longer crash in most cases when trying to open files that
    are not traces. Some crashes are still possible, due to support of old,
    header-less traces.
- Ability to track every memory allocation in the profiled program.
  - Allocation event queuing must be done in order, which requires exclusive
    access to the serialized queue on the client side. This has no effect on
    the rest of events, which are stored in a concurrent queue, as before.
  - You can search for a memory address and see where it was allocated, for
    how long, etc. This lists all matching allocations since the program was
    started.
  - All active (non-freed) allocations may be listed. This shows the current
    memory state by default, but can go back to any point in time.
  - Graphical representation of process memory map may be displayed. New
    allocations/frees are displayed in a bright color and fade out with
    time. This feature can also look back in time.
  - Memory usage plot is automatically generated.
  - Basic allocation information is displayed in memory plot tooltips.
  - A summary of memory events within a zone (and its children) is now
    printed in zone info window.
- Support loading profile dumps with no memory allocation data (generated by
  v0.2).
- Added ability to display global statistics of a selected zone from the
  zone info window.
- Fixed regression with lock announce processing that appeared during
  worker/viewer split.
- Allow selecting/unselecting all locks for display.
- Performance improvements.
- Don't save unneeded lock information in trace file.
- Don't save trash in message list data.
- Allow expanding view span up to one hour, instead of one minute.
- Added trace comparison window.
  - An external trace has to be loaded first.
  - Zone query in both traces (current and external).
  - Both results are overlaid on the same histogram.
  - Graphs can be adjusted as if the same number of zones had been
    collected.
- Read time directly from a hardware register on ARM/ARM64, if possible.
  - User-space access to the timer needs to be enabled in the kernel, so
    tracy will perform run-time checks and fall back to the old method if the
    check fails.
- Prevent connections in a TIME-WAIT state from blocking new listen
  connections.
- Display y-range of plots.
- Added ability to unload traces loaded from files. To do so, close the main
  profiler window. You will return to the connect/open selection dialog.
  Live captures cannot be terminated this way.
- Zones previously displayed in zone info window are remembered and you can
  go back to them. Closing the zone info window or switching between CPU and
  GPU zones will clear the memory.
- Improved message list window.
  - Messages are now displayed in columns.
  - Originating thread of each message is now included in the list.
- You can now navigate to next and previous frame.
- Zone statistics can now be displayed using only self times.
- Support for tracing GPU events using Vulkan.
- Timeline will now display "OpenGL context" or "Vulkan context" instead of
  "GPU context".
- Fixed regression causing invalid display of GPU context appearance time.
- Fixed regression causing invalid reporting of an active CPU in zone end
  events, if MSVC rdtscp optimization was not enabled.
- Ability to collect true call stacks.
  - Supported on Windows, Linux, Android.
  - The following events can collect call stacks:
    - Memory alloc/free.
    - Zone begin.
    - GPU zone begin.
  - Zone stack trace now also displays frames from a real call trace.
  - On Linux, call stack frame name resolution requires a call to dladdr,
    which in turn requires linking with libdl.
- Allow manual entry of GPU time drift value.
- Unix build system no longer shares object files between different build
  units.
  - Fixes inability to build debug and release versions of a single utility
    without "make clean".
  - Fixes incompatibility between "standalone" and "capture" utilities due
    to different sets of feature flags being used.
- On Windows, the "standalone" utility now adapts to the system DPI setting.
- Optional per-call zone naming.
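
The memory tracking items in the list above (address search, listing active allocations at any point in time, the memory usage plot) all rely on alloc/free events being processed in order. The sketch below shows one way such queries could work on an ordered event stream. It is only an illustration and not how tracy implements it; `Allocation`, `MemoryLog` and all method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Allocation:
    address: int
    size: int
    time_alloc: int
    time_free: Optional[int] = None  # None while the block is still live


class MemoryLog:
    """Hypothetical store of alloc/free events, fed in chronological order."""

    def __init__(self):
        self.allocations = []  # every allocation ever seen, in order
        self._live = {}        # address -> index into self.allocations

    def on_alloc(self, time, address, size):
        self._live[address] = len(self.allocations)
        self.allocations.append(Allocation(address, size, time))

    def on_free(self, time, address):
        # Relies on in-order delivery: the matching alloc was already seen.
        index = self._live.pop(address)
        self.allocations[index].time_free = time

    def find(self, address):
        """All allocations, past or present, whose range contains address."""
        return [
            a for a in self.allocations
            if a.address <= address < a.address + a.size
        ]

    def active_at(self, time):
        """Allocations that were live (allocated, not yet freed) at time."""
        return [
            a for a in self.allocations
            if a.time_alloc <= time
            and (a.time_free is None or a.time_free > time)
        ]

    def usage_at(self, time):
        """Total bytes live at time; one sample of a memory usage plot."""
        return sum(a.size for a in self.active_at(time))


log = MemoryLog()
log.on_alloc(10, 0x1000, 64)
log.on_alloc(20, 0x2000, 128)
log.on_free(30, 0x1000)
log.on_alloc(40, 0x1000, 32)  # the same address is reused later

matches = log.find(0x1010)
print(len(matches), log.usage_at(25), log.usage_at(35), log.usage_at(45))
```

In this toy run the address 0x1010 matches two allocations, because the same block was freed and later reused. A linear scan keeps the sketch short; a real profiler handling millions of events would need indexed structures instead.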