Absolutely text. People notice the thumbnail BEFORE they notice the title, so if the thumbnail already describes what the video is about, it'll generate more clicks.
There's actually a popular YouTuber in my niche (Actualized) who I simply don't watch, for the one simple reason that he doesn't include text in his thumbnails; they're all just pictures of his face. I tend to read the titles in thumbnails and pay less attention to the video title itself, so scrolling through his videos to find one to watch is annoying because I'm used to seeing the titles on the thumbnails. It's a shame, because he posts really good content; it's just harder for me to find videos.
This goes for any channel that doesn't include text on its thumbnails: I find it harder to pick out videos there than on channels that make the topic immediately obvious in the thumbnail itself. I'm sure I'm not the only person who is more likely to click a thumbnail if it has text in it.
To show how this works, here are his thumbnails vs. my thumbnails side by side:
As an experiment, I want you to find the following videos and time how long each one takes.
On the left under his videos, find a video about how your mind distorts reality.
On the right, under my videos, find a video about how to lucid dream.
Then compare how long each one took.