there are two ways, really. first, try to make what you say as interesting, informative and easy to understand as possible. re-record it if your intonation is off or if you come across the wrong way. i often put videos on simply to listen to the person speaking, so mad visuals alone aren't going to hold my attention for long. try playing one of your videos without looking at the screen, and see if you can still follow its basic premise.
or, you could go down the route of making the video as creative as possible. interesting visuals, like editing in cutaways (comedy skits, artsy shots of landscapes, you name it!), can influence how your words come across, and your audience may be more captivated by that, or come away with a clearer understanding of your points.
entertainment, though, is of course subjective. you've got to think about what would entertain a viewer of yours, and whether the people your videos currently entertain are the audience you actually want. if not, try something new!