MIT and Google researchers have built AI that can link sound, sight, and text to understand the world