I use Sony Vegas, but the idea is the same in any editor. Your text media is a clip, and you put it on a track above the video. If you want each word to come out on its own, you'll have to make a separate clip for each word and time it properly. It is VERY time consuming that way. The easiest method is to do an entire line at a time. The next step up is to place one word at the edge of the screen, place the next word next to it, and so on until you fill the bottom of the screen, then start again. You'll need multiple video tracks stacked on top of each other, one for each word, but it is still easier to do (not sure how your program works). That way is the best because the words don't jump by too fast, people can read the entire line, and each word appears when you want it to.
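If it helps to plan the word-by-word layout before you start dragging clips around, here is a rough Python sketch of the bookkeeping. It is not tied to Vegas or any other editor: `WordClip`, `plan_line`, and the pixel values are all made up for illustration, and the word widths are just guessed from the letter count.

```python
from dataclasses import dataclass


@dataclass
class WordClip:
    text: str
    track: int    # video track above the footage, one per word
    start: float  # seconds at which the word appears
    end: float    # seconds at which the whole line is cleared
    x: int        # left edge of the word in pixels


def plan_line(words, starts, line_end, char_width=30, gap=20, left_margin=40):
    """Lay the words out left to right, one track per word.

    Every word stays on screen until line_end, so the full line
    remains readable after the last word appears.
    char_width, gap, and left_margin are assumed values in pixels.
    """
    clips = []
    x = left_margin
    for track, (word, start) in enumerate(zip(words, starts), start=1):
        clips.append(WordClip(word, track, start, line_end, x))
        # Crude width estimate: fixed pixels per character plus a gap.
        x += len(word) * char_width + gap
    return clips


clips = plan_line(["one", "two", "three"], [0.0, 0.4, 0.8], line_end=2.0)
for clip in clips:
    print(clip)
```

Each row tells you which track the word goes on, when its clip starts, and roughly where to position it. All the clips end together when the line is done.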
If you watch the Surgeon Simulator video in my sig, you'll see what I mean. I use both the single text method and the line method.
Good luck! Have fun? If you can? My video took about 6 hours to edit =)
It's worth it though!