Thoughts on efficiency
Since a delay is part of the program, making it run faster isn't necessarily a goal, but here are some thoughts on efficiency in case you want to explore this further. First, I realized belatedly that I hadn't answered your question about the return value of Board::render(). In my view, you have it exactly right in the code now. Returning a reference would be an error because, as soon as the function ends and the ret variable goes out of scope, its destructor is called, leaving the reference dangling. When you return by value, as the current code does, a copy is notionally created. (I say "notionally" because most compilers are, in fact, smart enough to apply Named Return Value Optimization (NRVO) and avoid actually making a copy.) Also, while you could allocate on the heap and return a pointer, freeing that memory then becomes another problem. For all of these reasons, I'd say that the way you have it is just right.
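To make the return-by-value point concrete, here is a minimal sketch. It is not your actual class: the std::string return type, the flat std::vector<bool> storage, the set() helper and the '#'/'.' characters are all assumptions made purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the reviewed Board; the real class may differ.
class Board {
public:
    Board(std::size_t rows, std::size_t cols)
        : rows_{rows}, cols_{cols}, cells_(rows * cols, false) {}

    // Hypothetical helper used only to populate the sketch.
    void set(std::size_t r, std::size_t c, bool alive) {
        assert(r < rows_ && c < cols_);
        cells_[r * cols_ + c] = alive;
    }

    // Return by value: `ret` is a local, so the caller receives its own
    // object. Compilers may apply NRVO here; failing that, the string is
    // moved rather than copied.
    std::string render() const {
        std::string ret;
        ret.reserve(rows_ * (cols_ + 1));
        for (std::size_t r = 0; r < rows_; ++r) {
            for (std::size_t c = 0; c < cols_; ++c) {
                ret += cells_[r * cols_ + c] ? '#' : '.';
            }
            ret += '\n';
        }
        return ret;
    }

    // By contrast, the following would compile (usually with a warning)
    // but hand back a reference to an object that no longer exists:
    //   const std::string& render() const { std::string ret; /* ... */ return ret; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<bool> cells_;
};
```

A caller can then write `std::string view = board.render();` and own the result outright, with nothing to free afterwards.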
However, one option for a possible efficiency gain would be for the Board object to contain two copies of the board and keep track of which one is current within nextRound() and render(). That way, instead of allocating a new board (and destroying the old one) on each call to nextRound(), the program could reuse the same two vectors and simply swap them on each loop iteration.
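A rough sketch of that two-buffer idea might look like the following. Everything beyond the nextRound() name is an assumption: the member names current_ and next_, the set() and alive() helpers, the flat unsigned char storage, and the Conway-style rule with dead cells beyond the edges are placeholders for whatever your real update logic does.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical double-buffered Board; names and rule are illustrative only.
class Board {
public:
    Board(std::size_t rows, std::size_t cols)
        : rows_{rows}, cols_{cols},
          current_(rows * cols, 0), next_(rows * cols, 0) {}

    void set(std::size_t r, std::size_t c, bool alive) {
        assert(r < rows_ && c < cols_);
        current_[r * cols_ + c] = alive ? 1 : 0;
    }

    bool alive(std::size_t r, std::size_t c) const {
        assert(r < rows_ && c < cols_);
        return current_[r * cols_ + c] != 0;
    }

    // Writes the new generation into next_, then swaps the two buffers.
    // Both vectors are allocated once in the constructor and reused.
    void nextRound() {
        for (std::size_t r = 0; r < rows_; ++r) {
            for (std::size_t c = 0; c < cols_; ++c) {
                const int n = neighbors(r, c);
                const bool live = current_[r * cols_ + c] != 0;
                // Placeholder rule (Conway-style), assumed for the sketch.
                next_[r * cols_ + c] = (n == 3 || (live && n == 2)) ? 1 : 0;
            }
        }
        current_.swap(next_);  // constant time: only internal pointers move
    }

private:
    // Counts live neighbors; cells outside the grid are treated as dead.
    int neighbors(std::size_t r, std::size_t c) const {
        int count = 0;
        for (int dr = -1; dr <= 1; ++dr) {
            for (int dc = -1; dc <= 1; ++dc) {
                if (dr == 0 && dc == 0) continue;
                const auto nr = static_cast<std::ptrdiff_t>(r) + dr;
                const auto nc = static_cast<std::ptrdiff_t>(c) + dc;
                if (nr < 0 || nc < 0 ||
                    nr >= static_cast<std::ptrdiff_t>(rows_) ||
                    nc >= static_cast<std::ptrdiff_t>(cols_)) {
                    continue;
                }
                count += current_[static_cast<std::size_t>(nr) * cols_ +
                                  static_cast<std::size_t>(nc)];
            }
        }
        return count;
    }

    std::size_t rows_;
    std::size_t cols_;
    std::vector<unsigned char> current_;  // what render() would read
    std::vector<unsigned char> next_;     // scratch buffer for nextRound()
};
```

Because every cell of next_ is overwritten on each pass, there is no need to clear it between rounds, and render() only ever needs to look at current_.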