
Audio Visualizer

Erik Ivar Haav, Charlotte Pree, Richard Prost

This project aims to create a reactive audio visualizer inside the Unity game engine using the High Definition Render Pipeline and the CSCore audio library. The visualizer should react in real time to the audio output without requiring access to the audio file itself.

Links:
GitHub repository
Link to downloadable build
Preview video on YouTube


Screenshots:

Audio Processing

The raw audio signal is captured with the CSCore audio library via a WASAPI loopback capture. The samples are stored in a buffer (float[]), which is then processed by a method that computes the audio spectrum. The spectrum is split into frequency bands logarithmically rather than linearly: since the human ear differentiates lower frequencies better than higher ones, an equal linear split would make little sense for an audio visualizer.
Before the band values are stored, normalization is applied to each frequency band. Without normalization the higher bands would appear much weaker, because their energy is averaged over a wider frequency range (this remains true, to a lesser degree, even with normalization applied), causing an unbalanced visualization.
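
A minimal sketch of the logarithmic band split, assuming illustrative values for the band count and minimum frequency (the project's actual parameters may differ):

using UnityEngine;

static class SpectrumBands
{
    // Average FFT bins into logarithmically spaced bands: each band spans the
    // same frequency *ratio*, not the same width.
    public static float[] Split(float[] spectrum, int bandCount, float sampleRate, float minFreq = 30f)
    {
        float maxFreq = sampleRate / 2f;              // Nyquist frequency
        float binWidth = maxFreq / spectrum.Length;
        var bands = new float[bandCount];

        for (int b = 0; b < bandCount; b++)
        {
            float lo = minFreq * Mathf.Pow(maxFreq / minFreq, (float)b / bandCount);
            float hi = minFreq * Mathf.Pow(maxFreq / minFreq, (float)(b + 1) / bandCount);

            int loBin = Mathf.Clamp(Mathf.FloorToInt(lo / binWidth), 0, spectrum.Length - 1);
            int hiBin = Mathf.Clamp(Mathf.CeilToInt(hi / binWidth), loBin + 1, spectrum.Length);

            float sum = 0f;
            for (int i = loBin; i < hiBin; i++) sum += spectrum[i];
            bands[b] = sum / (hiBin - loBin);         // average energy of the band
        }
        return bands;
    }
}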

Normalization Types

  • Gain Normalization

Gain normalization applies a curve that approximates the amount of energy to add to each band to balance them. Each band's energy is multiplied by the square root of the band's frequency divided by the minimum frequency, normalizing the strength relative to the lowest frequency band.

  • Psychoacoustic Normalization

This also applies a curve, but relative to 1000 Hz, the frequency the human ear perceives most linearly. An exponent of 0.3 is applied to the frequency ratio, which compresses the lower frequencies slightly while boosting the higher ones.

  • Adaptive Normalization

Adaptive normalization keeps a running average of each frequency band's energy and scales the returned values accordingly, so the amplitude of each band looks similar. This usually gives the most balanced visuals of the three types, but it is prone to jumping and jittering when the dynamics vary a lot. A combined sketch of all three modes follows this list.
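
A minimal sketch of the three normalization modes described above; the smoothing constant and the epsilon guard are illustrative assumptions:

using UnityEngine;

static class BandNormalization
{
    // Gain: boost each band by sqrt(f / fMin) so higher bands aren't dwarfed.
    public static float Gain(float energy, float bandFreq, float minFreq)
        => energy * Mathf.Sqrt(bandFreq / minFreq);

    // Psychoacoustic: weight relative to 1000 Hz with a 0.3 exponent,
    // compressing the lows slightly while boosting the highs.
    public static float Psychoacoustic(float energy, float bandFreq)
        => energy * Mathf.Pow(bandFreq / 1000f, 0.3f);

    // Adaptive: divide by a running average of the band's own energy so all
    // bands reach similar amplitudes; 'smoothing' controls how fast it adapts.
    public static float Adaptive(float energy, ref float runningAvg, float smoothing = 0.99f)
    {
        runningAvg = smoothing * runningAvg + (1f - smoothing) * energy;
        return energy / Mathf.Max(runningAvg, 1e-5f);
    }
}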

Beats are detected by keeping a running average of the low-frequency energy (30 Hz - 180 Hz) and clamping that average to a reasonable range, which prevents jitter at low volumes and keeps sustained loudness from drowning the detector out. The current bass energy is then compared against the average: if it is above the average, still rising, and enough time has passed since the last beat, the strength of the beat is calculated, which beat-effect drivers can then read.
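
A sketch of this beat-detection loop; the smoothing factor, clamp range and minimum beat interval are assumed values, not the project's:

using UnityEngine;

class BeatDetector
{
    float avgBass;                     // running average of 30-180 Hz energy
    float prevBass;                    // previous frame's bass energy
    float lastBeatTime;

    const float Smoothing = 0.95f;
    const float MinInterval = 0.15f;           // minimum time between beats (assumed)
    const float MinAvg = 0.01f, MaxAvg = 1f;   // clamp range (assumed)

    // Returns beat strength, or 0 when no beat fired this frame.
    public float Update(float bassEnergy)
    {
        avgBass = Mathf.Clamp(Smoothing * avgBass + (1f - Smoothing) * bassEnergy,
                              MinAvg, MaxAvg);

        float strength = 0f;
        bool loudEnough = bassEnergy > avgBass;
        bool rising = bassEnergy > prevBass;
        bool spacedOut = Time.time - lastBeatTime > MinInterval;

        if (loudEnough && rising && spacedOut)
        {
            strength = bassEnergy / avgBass;   // how far above average the hit is
            lastBeatTime = Time.time;
        }
        prevBass = bassEnergy;
        return strength;
    }
}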

Visualizer Bars Effect

The visualizer bars effect is the simpler of the two. Each bar represents one frequency band and is scaled along the vertical axis according to the energy of its band. Each bar has an emissive material whose strength and color are driven by the fragment's position on the vertical axis: the position is remapped from a minimum and maximum height to the 0-1 range, which is then used to lerp between colors. The bars also emit particles according to their height using Unity's VFX Graph, with taller, more intense bars emitting more particles.
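
The remap itself happens in the fragment shader; a C# analogue of the same math looks like this:

using UnityEngine;

static class BarEmission
{
    // Remap a world-space height into 0..1 and lerp between two emission colors,
    // mirroring what the bar shader does per fragment.
    public static Color EmissionAt(float worldY, float minHeight, float maxHeight,
                                   Color bottom, Color top)
    {
        float t = Mathf.InverseLerp(minHeight, maxHeight, worldY);
        return Color.Lerp(bottom, top, t);
    }
}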

The ground mimics water by sampling gradient noise. The receiving object samples the noise in world space instead of its UVs. The world positions are tiled and offset, using time fed through a sine function for one noise and scaled time for the other. The two noises are then multiplied together, and a normal map is generated from the result in the shader, giving a fairly convincing, non-repeating wave pattern.
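
The noise sampling itself lives in the water shader; a sketch of the two time-driven offsets, with hypothetical material property names (_NoiseOffsetA, _NoiseOffsetB):

using UnityEngine;

public class WaterOffsets : MonoBehaviour
{
    [SerializeField] Material water;
    [SerializeField] float amplitude = 0.5f, speed = 0.1f;

    void Update()
    {
        // First noise oscillates back and forth; the second scrolls steadily.
        water.SetVector("_NoiseOffsetA", new Vector4(Mathf.Sin(Time.time) * amplitude, 0f));
        water.SetVector("_NoiseOffsetB", new Vector4(Time.time * speed, 0f));
    }
}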

The beat-driven effect in this scene is a water ripple. The strength, duration and size of the ripple are determined by the strength of the beat. The ripple itself is a decal projector that projects only its normal onto the water surface, overriding the waves' normals. This gives a fairly convincing effect even without vertex displacement or tessellation.
The other part consists of shiny VFX Graph particles with trails that appear on each beat, with the intensity (both amount and velocity) driven by the beat strength.
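
A sketch of the beat-driven ripple using HDRP's DecalProjector; the duration and size scaling are illustrative assumptions:

using System.Collections;
using UnityEngine;
using UnityEngine.Rendering.HighDefinition;

public class RippleDriver : MonoBehaviour
{
    [SerializeField] DecalProjector ripple;   // normal-only decal on the water

    // Called by the beat detector with the computed beat strength.
    public void OnBeat(float strength)
    {
        StopAllCoroutines();
        StartCoroutine(Animate(strength));
    }

    IEnumerator Animate(float strength)
    {
        float duration = 0.5f * strength;     // stronger beats ripple longer
        float maxSize = 4f * strength;        // ...and wider
        for (float t = 0f; t < duration; t += Time.deltaTime)
        {
            float p = t / duration;
            ripple.size = new Vector3(maxSize * p, maxSize * p, ripple.size.z);
            ripple.fadeFactor = Mathf.Clamp01(strength * (1f - p)); // fade while expanding
            yield return null;
        }
        ripple.fadeFactor = 0f;
    }
}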

Hexagon Ripples Effect

The hexagon ripples effect is much more complicated, starting with the hexagon grid mesh itself. The mesh is generated at runtime, hex by hex. Instead of UV unwrapping, every vertex of a given hexagon is assigned the same UV value (in the third UV channel): the hexagon's world position divided by the maximum bounds. For example, with bounds of (200, 200), the hexagon at coordinates (100, 100) gets a UV value of (0.5, 0.5) for each vertex; a sketch of this assignment follows the figures below. We first tried to generate the hexagon grid mesh in Blender using Geometry Nodes, but exporting the UV information turned out to be harder than generating the mesh in Unity at runtime.


The resulting mesh of 200x200 hexagons

Top-down view of the resulting UV map applied to the hexagon mesh as a base color
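
As referenced above, a sketch of the per-hex UV assignment; the vertex layout (hex center first, fixed vertex count per hex) is an assumption, and the key point is that all vertices of one hex share a single UV value in the third channel:

using System.Collections.Generic;
using UnityEngine;

static class HexUVs
{
    public static void Assign(Mesh mesh, List<Vector3> vertices, Vector2 bounds, int vertsPerHex)
    {
        var uvs = new List<Vector2>(vertices.Count);
        for (int i = 0; i < vertices.Count; i += vertsPerHex)
        {
            // Same UV for every vertex of this hex: world position / max bounds,
            // e.g. a hex at (100, 100) with bounds (200, 200) gets (0.5, 0.5).
            Vector3 center = vertices[i];
            var uv = new Vector2(center.x / bounds.x, center.z / bounds.y);
            for (int j = 0; j < vertsPerHex; j++) uvs.Add(uv);
        }
        mesh.SetUVs(2, uvs);   // third UV channel (zero-based index 2)
    }
}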

Once we had the mesh, we began work on the shader. We wanted to drive the vertex displacement of the mesh with a texture. For testing, a naive single-ripple texture was used to verify that the tiling and offset of the UVs were correct. This texture determined which hexes to displace, and the displacement was multiplied by the desired height value (DisplacementMultiplier). The vertices to displace within a single hex were found by comparing heights along the vertical axis in object space, so vertices on the lower half of the hexagon always stayed in place. The resulting displacement was multiplied by a static, high-floor gradient noise to add a little variety between neighboring hexagons.
The results looked promising but weren't dynamic at all. To actually get the ripples moving, we had to find a different approach. One method would have been to use a second, orthographic camera to render sprites - instantiated according to spectrum values and animated each frame - onto a render texture (RT) and then use that RT as the ripple texture. While it worked, it didn't look smooth: it caused jumpy movement, and the interior and exterior of the ripple looked the same.

Instead we used a similar method with the added step of using command buffers. A Blit command copies the source RT into a temporary RT through a material that reduces each pixel's value by multiplying it with a preset fade amount; for example, with a fade amount of 0.1, each pixel of the source is multiplied by (1 - 0.1) every frame. The faded RT is then blitted back to the source RT, overwriting the existing source texture with the faded version. Each sprite is then drawn on top, and the process repeats next frame. This gave us a smoothly fading ripple without duplicating sprite renderers for each step of the fade, and it was highly modifiable, since the fade shader gave us full control over the process.
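
A sketch of the fade-and-stamp loop with command buffers; fadeMaterial and the sprite renderers stand in for the project's own assets:

using UnityEngine;
using UnityEngine.Rendering;

static class RippleFade
{
    // Fade the ripple RT by one step, then stamp the current sprites on top.
    public static void BuildCommands(CommandBuffer cb, RenderTexture source,
                                     RenderTexture temp, Material fadeMaterial,
                                     Renderer[] sprites)
    {
        cb.Clear();
        cb.Blit(source, temp, fadeMaterial);   // temp = source * (1 - fadeAmount)
        cb.Blit(temp, source);                 // write the faded result back
        cb.SetRenderTarget(source);
        foreach (var sprite in sprites)
            cb.DrawRenderer(sprite, sprite.sharedMaterial);  // draw new ripples on top
    }
}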

The render camera was scaled to fit the hexagon grid precisely, and the sprite renderer spawn points were spread evenly along the surface. We used 4 frequency bands instead of the 32 used in the visualizer bars effect, since the effect got too busy otherwise.


Displacement with UVs as the base color

The ripples render texture after fading had been applied

This wasn't the end, however. We also created mossy stone PBR textures, which we applied to the hexagon grid mesh using triplanar mapping, for a magical, slightly otherworldly look: natural, overgrown stone surfaces contrasting with shiny, mystical, alien-looking emissive surfaces. For emission we used the smoothness map as a base and followed a similar approach to the visualizer bars, remapping world-space height to emission strength and color, with one difference: emission was applied only to faces whose normal had a vertical component below 0.5. As a result, the topmost faces of the hexagons kept their base color without any emission, preserving the mesh's grid-like appearance.

The emission with all other maps disabled/flat

A close up of the hex mesh with displacement and all detail textures applied

A close up of the hex mesh with the emissive map properly visible

For beat-driven effects, this scene has explosions in the background, driven by the VFX Graph and six-way smoke lighting; the smoke assets were sourced from Unity. The beat also drove the brightness of the colored sun. A much darker second sun was present in the scene so it wouldn't get completely dark when there was no beat.

As tertiary effects, both scenes use optional volumetric clouds, volumetric fog, screen-space reflections and bloom. Although optional, screen-space reflections and especially bloom greatly enhance the visuals.