Both a hardware UART and software "bit banging" accomplish the same thing: put high and low signals on a digital pin emanating from the microcontroller.
That signal is not suitable for MIDI, nor for RS-232; it has to be conditioned to drive the line properly.
MIDI differs from RS-232.  RS-232 is based on voltage levels, whereas MIDI is based on a current loop. MIDI is actually defined not in terms of a specification of currents or voltages, but by a reference circuit diagram.
To drive a MIDI line, you provide a current source which turns an LED inside an opto-coupler on the other end on and off.  When no current flows, this is a mark, or 1.  A zero is indicated by sourcing 5 mA of current.
You can source the current from the high side using a PNP transistor circuit. A low voltage from your controller to the base of the transistor will turn on the 5 mA; a high level will turn it off.
Many circuits you can find on the internet naively use an NPN transistor, preceded by a logic inverter, which is completely silly because if you don't already have inverters in your circuit, you need a whole new chip (which typically provides six of the inverters).
If you're driving the communication in software, the issue of inversion is moot, of course, since you can invert the logic yourself, but it makes more sense to design a circuit that can later be reused with a UART without needing an inverter.
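The inversion question can be reduced to one small function. The following is only a sketch: the names `midi_driver_t` and `midi_pin_level` are made up for illustration, and the NPN case assumes a low-side NPN stage with no inverter in front of it, so that a high level on the base turns the current on.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
typedef enum {
    DRIVER_PNP_HIGH_SIDE,  /* low pin level turns the loop current on  */
    DRIVER_NPN_LOW_SIDE    /* high pin level turns the loop current on
                              (assumed: no inverter ahead of the NPN)  */
} midi_driver_t;

/* Returns the pin level (true = high) needed to send one logical bit. */
static bool midi_pin_level(midi_driver_t driver, bool bit)
{
    assert(driver == DRIVER_PNP_HIGH_SIDE || driver == DRIVER_NPN_LOW_SIDE);

    /* A logical 0 means current flows; a logical 1 (mark) means none. */
    bool current_on = !bit;

    if (driver == DRIVER_PNP_HIGH_SIDE)
        return !current_on;   /* pin level equals the bit, like a UART TX pin */

    return current_on;        /* pin level is the bit inverted */
}
```

With the PNP stage the pin level is simply the bit value, which is what a UART transmit pin produces, so the same circuit works for both the software and the hardware approach.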
The rate is 31.250 kbps. If you do it in software, you have to use timing loops or whatever technique matches this rate as accurately as possible. This rate was chosen for MIDI because 1 MHz clocks are common, and 31.250 kHz is 1 MHz divided by 32.  The usual serial baud rates like 38400 are not nice divisors of a round MHz frequency: a 1 MHz clock has to be scaled up to 24 MHz before it is divisible by 38400.
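The divisor arithmetic is easy to check in code. This is a small sketch, not taken from any particular library; `cycles_per_bit` and `MIDI_BAUD` are illustrative names. It returns the number of clock cycles in one bit period, or 0 when the clock is not an exact multiple of the bit rate.

```c
#include <assert.h>
#include <stdint.h>

#define MIDI_BAUD 31250u   /* bits per second */

/* Clock cycles per bit, or 0 if clock_hz is not an exact multiple of baud. */
static uint32_t cycles_per_bit(uint32_t clock_hz, uint32_t baud)
{
    assert(baud > 0);

    if (clock_hz % baud != 0)
        return 0;             /* no exact integer divisor exists */

    return clock_hz / baud;
}
```

For example, `cycles_per_bit(1000000u, MIDI_BAUD)` gives 32, while `cycles_per_bit(1000000u, 38400u)` gives 0 and `cycles_per_bit(24000000u, 38400u)` gives 625. One bit at 31250 bps lasts 32 microseconds, which is the interval a software timing loop has to hit.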
As in RS-232 communication, when nothing is being transmitted, the line is in a 1 state. In MIDI this means no current is flowing. MIDI uses the 8N1 format for framing individual bytes: a byte is encoded as ten bits: a start bit whose value is 0, 8 bits of data, and a stop bit whose value is 1.  Thus, 3125 bytes per second can be transmitted (bit rate divided by ten).
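The ten-bit frame can be written out explicitly. The sketch below uses hypothetical names (`midi_frame_bits`, `FRAME_BITS`) and sends the data bits least significant bit first; that ordering is not stated above, but it is the usual convention for asynchronous serial links, and MIDI follows it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_BITS 10   /* 1 start bit + 8 data bits + 1 stop bit */

/* Fills out[] with the logical line states for one byte, in transmit order. */
static void midi_frame_bits(uint8_t byte, uint8_t out[FRAME_BITS])
{
    assert(out != NULL);

    out[0] = 0;                              /* start bit: 0, current flows */

    for (int i = 0; i < 8; i++)              /* data bits, LSB first */
        out[1 + i] = (uint8_t)((byte >> i) & 1u);

    out[FRAME_BITS - 1] = 1;                 /* stop bit: 1, no current */
}
```

For the byte 0x90 (binary 1001 0000) this produces the sequence 0, 0 0 0 0 1 0 0 1, 1 on the line.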
In other words, the communication of each byte begins when current is driven to signal a zero. Eight more bits follow, which are just line levels clocked at even intervals from the start of the initial zero. Then the line returns to 1 for one bit period (the stop bit) before another byte can be transmitted. The next byte can be transmitted after an arbitrary additional pause, and that pause is not a multiple of any clock, which is why this is asynchronous serial communication.
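Put together, a bit-banged transmitter is just this sequence with a fixed delay after each level change. The following is a rough sketch under several assumptions: the hook structure `midi_tx_hooks_t` and all function names are hypothetical, a high pin level is taken to mean no current (as with the PNP driver described earlier), data bits go out LSB first, and the `trace_*` functions are host-side stand-ins that merely record what a real pin and delay routine would do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MIDI_BIT_TIME_US 32u   /* 1 s / 31250 bits = 32 microseconds per bit */

/* Hypothetical hardware hooks; replace with real pin and timer code. */
typedef struct {
    void (*set_pin)(bool high);      /* drive the output pin */
    void (*delay_us)(uint32_t us);   /* wait for the given time */
} midi_tx_hooks_t;

/* Sends one 8N1 frame. Assumes pin high = no current = logical 1. */
static void midi_send_byte(const midi_tx_hooks_t *hw, uint8_t byte)
{
    assert(hw != NULL && hw->set_pin != NULL && hw->delay_us != NULL);

    hw->set_pin(false);                  /* start bit: 0, current flows */
    hw->delay_us(MIDI_BIT_TIME_US);

    for (int i = 0; i < 8; i++) {        /* data bits, LSB first */
        hw->set_pin(((byte >> i) & 1u) != 0);
        hw->delay_us(MIDI_BIT_TIME_US);
    }

    hw->set_pin(true);                   /* stop bit: 1, line back to idle */
    hw->delay_us(MIDI_BIT_TIME_US);
}

/* Host-side stand-ins that record the pin levels and the elapsed time. */
static bool     trace_levels[32];
static size_t   trace_count;
static uint32_t trace_elapsed_us;

static void trace_set_pin(bool high)
{
    if (trace_count < sizeof trace_levels / sizeof trace_levels[0])
        trace_levels[trace_count++] = high;
}

static void trace_delay_us(uint32_t us)
{
    trace_elapsed_us += us;
}
```

Calling `midi_send_byte` with the trace hooks and the byte 0x90 records ten pin levels and 320 microseconds of delay. On real hardware the time spent executing the instructions between delays also counts toward the bit period, so the delay value usually has to be trimmed, or the bit edges driven from a timer, to stay close to 31250 bps.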