The history behind the various base addresses is as follows:
- the area starting at A0000h was set aside for video frame buffers in general (see Who set the 640K limit?)
- MDA used 4KiB starting at B0000h (out of 16KiB reserved, and 32KiB occupied in practice because of partial address decoding in the MDA)
- CGA used 16KiB starting at B8000h (see also Who decides what is the memory address that the CGA video buffer will be mapped to?)
- EGA introduced modes based at A0000h, using planes of up to 64KiB, depending on the amount of memory installed; this wasn’t backward-compatible with MDA or CGA, so a new address was chosen
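To make the "4KiB of memory, 32KiB occupied" point concrete, here is a minimal sketch of what partial address decoding means for software. It assumes the simplest possible model, where the card responds anywhere in its 32KiB window and ignores the address bits above its 4KiB of RAM, so the same memory repeats throughout the window. The function name `mda_ram_offset` and the constants are my own, purely for illustration.

```python
MDA_BASE = 0xB0000          # start of the MDA buffer, as listed above
MDA_WINDOW = 32 * 1024      # range occupied in practice (partial decoding)
MDA_RAM = 4 * 1024          # display memory actually fitted on the card


def mda_ram_offset(addr):
    """Map a physical address to an offset in the MDA's 4KiB of RAM.

    Returns None when the address is outside the 32KiB window.
    Assumption: the upper address bits inside the window are simply
    ignored, so the 4KiB of RAM repeats eight times.
    """
    if not (MDA_BASE <= addr < MDA_BASE + MDA_WINDOW):
        return None
    return (addr - MDA_BASE) % MDA_RAM


if __name__ == "__main__":
    for addr in (0xB0000, 0xB1000, 0xB7FFF, 0xB8000):
        print(f"{addr:05X}h -> {mda_ram_offset(addr)}")
```

Under that model, B0000h and B1000h both land on offset 0 of the card's memory, while B8000h (the CGA base) falls outside the window entirely.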
I don’t know why the MDA and CGA base addresses were chosen in the B0000h bank rather than A0000h; but there is a useful side-effect to the choice: MDA and non-MDA modes can be used in parallel, which enables dual-monitor setups. Thanks to the non-overlapping display buffers, and to the non-overlapping I/O port ranges (3B0h–3BBh for MDA, 3D0h–3DFh for CGA; EGA adds 3C0h–3CFh), a variety of combinations are possible, starting with CGA and MDA in the original IBM PC (John Elliott’s “Dual-Head Operation on a Vintage PC” gives details). “Common” uses of this feature included displaying a running program on the colour screen and a debugger on the monochrome screen (even under Windows), or a CAD render on the colour screen and UI on the monochrome screen.
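The non-overlap claim is easy to check mechanically. The sketch below takes only the ranges quoted in this answer (it is a simplification, not a full map of every address each card can respond to in every mode) and looks for any pair of adapters whose memory windows or I/O port ranges collide. The helper names `overlaps` and `conflicts` are hypothetical.

```python
# Inclusive (start, end) ranges, taken from the figures quoted in the text.
MEMORY = {
    "MDA": (0xB0000, 0xB7FFF),  # 32KiB occupied in practice
    "CGA": (0xB8000, 0xBBFFF),  # 16KiB starting at B8000h
    "EGA": (0xA0000, 0xAFFFF),  # up to 64KiB per plane at A0000h
}
IO_PORTS = {
    "MDA": (0x3B0, 0x3BB),
    "CGA": (0x3D0, 0x3DF),
    "EGA": (0x3C0, 0x3CF),      # the range EGA adds
}


def overlaps(a, b):
    """True if two inclusive ranges share at least one address."""
    return a[0] <= b[1] and b[0] <= a[1]


def conflicts(ranges):
    """Return every pair of adapter names whose ranges overlap."""
    names = sorted(ranges)
    return [
        (x, y)
        for i, x in enumerate(names)
        for y in names[i + 1:]
        if overlaps(ranges[x], ranges[y])
    ]


if __name__ == "__main__":
    print("memory conflicts:", conflicts(MEMORY))
    print("I/O conflicts:", conflicts(IO_PORTS))
```

Both lists come back empty for the ranges above, which is what lets the cards coexist on the same bus without any arbitration.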
Aside from dual-monitor use, the combination of features provided on the MDA and CGA required different ports and memory addresses to be used. On PCs at the time, there was no central coordination of address assignment; each card in the system was equal on the bus, for both memory addresses and I/O ports, and chose which addresses it responded to. The original MDA also included a parallel port; even if dual displays weren’t envisioned, users buying a PC with MDA and later installing CGA might want to keep their MDA to connect their printer. If the MDA and CGA buffers or port ranges overlapped, the system wouldn’t behave correctly; it was probably simpler to map the adapters to separate ranges rather than add jumpers to choose which card “owned” the relevant addresses (as seen on EGA).
Dual-monitor use seems to have been an intentional decision; at least, it is mentioned in the original IBM PC Technical Reference, in the DIP switch settings for monitor type selection (the MDA setting is listed as “IBM monochrome display or both types of display adapters”; PCs with dual monitors used the monochrome monitor as their primary display). The comments on dual-adapter purchases in the internal IBM PC Q&A suggest, however, that simultaneous dual-monitor use wasn’t a well-known feature, at least not in 1981. Since the MDA also hosted the parallel port, PCs with both MDA and CGA weren’t unheard of, but most CGA-equipped PCs would only have a single, colour, monitor (and switches configured for CGA-only operation). (A separate card providing only a parallel port was also available.)
Bear in mind that the PC wasn’t really designed as a family of computers, and didn’t have an expansion-ready high-performance API of any kind, so many of its implementation details became features, simultaneously sub-optimal and locked-in for a long time.