It's called an editor.
Rarely simple, rarely free.
You have to place, order & time the images so they become video (and you'll have to pick WHAT pictures to use).
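To give a feel for what "place, order & time" means in practice, here is a rough Python sketch of the bookkeeping an editor does for you on its timeline. It is only an illustration under my own assumptions: the names `Clip` and `build_timeline`, the 25 fps default and the "spread the pictures evenly over the audio length" rule are all made up for this example, and no real editor is claimed to work this way internally.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    """One still image placed on the timeline (hypothetical structure)."""
    image: str        # file name of the picture
    start_frame: int  # first frame on which the picture is shown
    frame_count: int  # how many frames the picture stays on screen


def build_timeline(images, audio_seconds, fps=25):
    """Spread the images, in the given order, evenly across the audio length.

    Assumption: every picture gets the same screen time, with any leftover
    frames handed out one each to the earliest pictures.
    """
    if not images:
        raise ValueError("need at least one image")
    total_frames = round(audio_seconds * fps)
    if total_frames < len(images):
        raise ValueError("audio is too short to show every image")

    base, extra = divmod(total_frames, len(images))
    timeline = []
    start = 0
    for index, image in enumerate(images):
        count = base + (1 if index < extra else 0)
        timeline.append(Clip(image, start, count))
        start += count
    return timeline


if __name__ == "__main__":
    for clip in build_timeline(["a.jpg", "b.jpg", "c.jpg"], audio_seconds=10):
        print(clip)
```

In a real edit you would rarely want equal durations; you would time each picture to the music or narration by hand, which is exactly the part the editor's timeline is for.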
"Audio" does not "convert to video", unless you mean a visualization of the sound itself: a waveform, level meters, a Lissajous figure or something similar (which might include a compositing element that reacts to the audio waveform). Still & moving pictures become video.
If you want extremely simple & free, just use Windows Movie Maker (WMM). If you want better options & better quality than that, pay a little to get Sony Movie Studio or Cyberlink PowerProducer, or a TMPG, Magix or Pinnacle product. You could take a chance on Lightworks or a similar freeware GUI NLE, but there are hidden gotchas down that route (bugs, incompatibilities, crashes, little support). While VirtualDub & AviSynth can also do those things for free (and without many of the problems of the above freeware GUI NLEs), they are neither the best at powerful edit workflows nor easy to learn. In fact, they are more properly known as video PROCESSORS, rather than NLEs or compositors.
Scott