Suppose you are a video game character standing in the middle of a scene. You can move along 3 (or more) axes. Each movement changes the scene you see, and with each change the screen must be redrawn. The more memory available, the more the program can cache of what you are about to see. The fastest way to get the scene-change data is from memory. When I was peripherally involved with this, VRAM was microscopic (by today's standards). But I think VRAM is much faster for the video processor to access than system memory. Regardless, if the refresh data is not available in any memory, the program must retrieve it from storage media. If the entire program is on a hard drive, that is somewhat fast. If it is reading from a CD/DVD, it is going to be slow. Now factor in the detail, complexity of the artwork, etc. All of these things add to the amount of data that must be accessed. It is easy to see that the more video memory available, the better the performance will be.
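To make the caching argument concrete, here is a rough Python sketch of it, not how any real game engine or driver works. It treats VRAM as a small least-recently-used cache of scenes and charges a made-up cost for each fetch. The names (`TIER_COST`, `LRUCache`, `total_cost`) and the cost numbers are my own assumptions for illustration, not measurements.

```python
from collections import OrderedDict

# Hypothetical relative cost of one fetch from each place.
# These numbers are invented for illustration only.
TIER_COST = {"vram": 1, "hard_drive": 1000, "optical_disc": 10000}


class LRUCache:
    """Tiny least-recently-used cache standing in for video memory."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used


def total_cost(requests, vram_slots, media="hard_drive"):
    """Sum the fetch cost for a sequence of scene requests."""
    vram = LRUCache(vram_slots)
    cost = 0
    for scene in requests:
        if vram.get(scene) is not None:
            cost += TIER_COST["vram"]    # hit: data already in video memory
        else:
            cost += TIER_COST[media]     # miss: go out to storage media
            vram.put(scene, "data for " + scene)
    return cost


if __name__ == "__main__":
    moves = ["a", "b", "a", "b", "c", "a"]  # scenes visited as you move
    for slots in (1, 2, 3):
        print(slots, total_cost(moves, slots))
```

Running it with more slots gives a lower total for the same sequence of moves, and switching `media` to `"optical_disc"` makes every miss far more expensive, which is the whole point of the paragraph above.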
Now, if there are game developers who want to take exception to this, I admit my knowledge is absolutely unrelated to games. I worked with some Unix guys at the VR lab at the University of Washington. We were looking at ways to maximize the use of relational databases for their projects.