Sampling the User Controls in Neural Modeling of Audio Devices

Otto Mikkonen, Alec Wright and Vesa Välimäki

Companion page for the article in EURASIP Journal on Audio, Speech, and Music Processing.

Repository on GitHub

Abstract

This work studies neural modeling of nonlinear parametric audio circuits, focusing on how the diversity of settings of the target device user controls seen during training affects network generalization. To study the problem, a large corpus of training datasets is synthetically generated using SPICE simulations of two distinct devices, an analog equalizer and an analog distortion pedal. A proven recurrent neural network architecture is trained using each dataset. The difference in the datasets is in the sampling resolution of the device user controls and in their overall size. Based on objective and subjective evaluation of the trained models, a sampling resolution of five for the device parameters is found to be sufficient to capture the behavior of the target systems for the types of devices considered during the study. This result is desirable, since a dense sampling grid can be impractical to realize in the general case when no automated way of setting the device parameters is available, while collecting large amounts of data using a sparse grid only incurs small additional costs. Thus, the result helps to efficiently collect training data for neural modeling of other similar audio devices.

Fig. 1: Different parameter sampling densities δ used for the user controls. a) δ = 3. b) δ = 5. c) δ = 9. d) δ = 17.
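
As a rough illustration of what a sampling density δ means for data collection, the sketch below enumerates the control settings for each density in Fig. 1. It is a minimal sketch and not the authors' code: it assumes that each user control is normalized to the range [0, 1], that the δ settings are evenly spaced with both end points included, and that the device has two controls. The function name `control_grid` and the parameter `num_controls` are hypothetical.

```python
from itertools import product


def control_grid(delta, num_controls=2):
    """Return every combination of `delta` evenly spaced settings per control.

    Assumptions (not taken from the article): controls are normalized to
    [0, 1], spacing is uniform, and both end points are included.
    """
    if delta < 2:
        raise ValueError("delta must be at least 2")
    # Evenly spaced levels 0, 1/(delta-1), ..., 1 for a single control.
    levels = [i / (delta - 1) for i in range(delta)]
    # Cartesian product over all controls gives the full sampling grid.
    return list(product(levels, repeat=num_controls))


for delta in (3, 5, 9, 17):
    grid = control_grid(delta)
    # The number of settings grows as delta ** num_controls.
    print(f"delta = {delta:2d}: {len(grid)} control settings")
```

Under these assumptions, two controls require 9, 25, 81 and 289 settings for δ = 3, 5, 9 and 17, which shows how quickly a dense grid becomes impractical to set by hand. With uniform spacing the grids are also nested, so every setting of the δ = 3 grid appears again in the δ = 5 grid, and so on.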

Subjective evaluation

This section provides audio examples from the listening test. The listening test results are shown in Fig. 2.

Fig. 2: Listening test results for both targets. The asterisk (*) denotes the best-performing model for each sampling density.

ProCo RAT

Audio examples from the experiment are provided in the table below.

Anchor D3* D5* D9* D17* Dc* Reference

Pultec EQ

Audio examples from the experiment are provided in the table below.

Anchor D3* D5* D9* D17* Dc* Reference