I have written a very XNA SpriteBatch-like interface for drawing sprites in OpenGL. When begin is called, the vertex data buffer is mapped to a float*. The index buffer and vertex buffer are bound in begin, and it's assumed no other drawing is done in this OpenGL context between begin and end. In between begin and end, DrawSprite is called. DrawSprite has a bunch of overloads allowing one to draw with a scale, a matrix, a source rectangle, etc. However, they all take their parameters and call BufferSprite, which actually writes the sprite data to memory (the x, y, z position, the x, y texture coordinates, and the RGBA colour values). When end is called, the vertices are drawn in as few glDrawElements calls as possible.
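To make the memory layout concrete, here is a minimal sketch of the kind of write BufferSprite performs. It is not my actual code: it assumes 9 floats per vertex (position, texture coordinates, colour) and 4 vertices per sprite, and the names `SpriteQuad` and `BufferSpriteSketch` are hypothetical. A plain float pointer stands in for the mapped vertex buffer.

```cpp
#include <cassert>
#include <cstddef>

// Assumed layout (hypothetical): x, y, z, u, v, r, g, b, a per vertex.
constexpr std::size_t kFloatsPerVertex = 9;
constexpr std::size_t kVerticesPerSprite = 4;
constexpr std::size_t kFloatsPerSprite = kFloatsPerVertex * kVerticesPerSprite;

// Hypothetical sprite description; the real overloads take other parameters.
struct SpriteQuad {
    float x, y, z;         // top-left corner position
    float w, h;            // size of the quad
    float u0, v0, u1, v1;  // texture rectangle
    float r, g, b, a;      // colour
};

// Writes the four vertices of one sprite sequentially starting at dst and
// returns the pointer just past the written data, ready for the next sprite.
inline float* BufferSpriteSketch(float* dst, const SpriteQuad& s) {
    assert(dst != nullptr);
    // Corner order: top-left, top-right, bottom-right, bottom-left.
    const float xs[kVerticesPerSprite] = {s.x, s.x + s.w, s.x + s.w, s.x};
    const float ys[kVerticesPerSprite] = {s.y, s.y, s.y + s.h, s.y + s.h};
    const float us[kVerticesPerSprite] = {s.u0, s.u1, s.u1, s.u0};
    const float vs[kVerticesPerSprite] = {s.v0, s.v0, s.v1, s.v1};
    for (std::size_t i = 0; i < kVerticesPerSprite; ++i) {
        *dst++ = xs[i];
        *dst++ = ys[i];
        *dst++ = s.z;
        *dst++ = us[i];
        *dst++ = vs[i];
        *dst++ = s.r;
        *dst++ = s.g;
        *dst++ = s.b;
        *dst++ = s.a;
    }
    return dst;
}
```

With this layout each sprite costs 36 floats (144 bytes) of writes into the mapped buffer, so 33k sprites is roughly 4.75 MB per frame.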
I know it's a lot, but I'd appreciate it if someone could help me make this faster. If there is only one texture being drawn over and over again in release mode, I get about 60 FPS for about 33k sprites (the same texture). This only gets worse as I interleave textures. I've done a bit of profiling and omitting code, and it looks to me like BufferSprite is taking the most time by far. I'd just like to know if there's anything obvious I'm not doing right, such as writing to the buffer in a poor way, or maybe I should be uploading different data to the shader.
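To illustrate why interleaving textures makes things worse, here is a rough sketch of how the sprites could be grouped into draw calls at end. It assumes one draw call per run of consecutive sprites that share a texture; `Batch` and `BuildBatches` are hypothetical names, not my real classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One run of consecutive sprites sharing a texture (hypothetical type).
struct Batch {
    unsigned texture;         // texture id for this run
    std::size_t firstSprite;  // index of the first sprite in the run
    std::size_t spriteCount;  // number of sprites in the run
};

// Groups sprites, in submission order, into runs with the same texture.
// Each resulting batch would correspond to one glDrawElements call.
inline std::vector<Batch> BuildBatches(const std::vector<unsigned>& textureIds) {
    std::vector<Batch> batches;
    std::size_t total = 0;
    for (std::size_t i = 0; i < textureIds.size(); ++i) {
        if (batches.empty() || batches.back().texture != textureIds[i]) {
            batches.push_back(Batch{textureIds[i], i, 1});  // start a new run
        } else {
            ++batches.back().spriteCount;                   // extend the run
        }
        ++total;
    }
    assert(total == textureIds.size());  // every sprite lands in some batch
    return batches;
}
```

Four sprites with textures 1, 1, 1, 1 collapse into a single batch, while 1, 2, 1, 2 produce four batches, so the draw call count grows with the number of texture switches rather than the number of distinct textures.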
Also, I realise a lot of OpenGL stuff in this is wrapped in my own classes, so if you need any specific source code I'll edit this post.