Porting Xen Paravirtualization to
MIPS Architecture
Yonghong Song
Broadcom




Motivation

• Broadcom XLP
  –   8 cores, 4 threads each core
  –   Out-of-order
  –   L1D, L1I, L2 each core, shared L3
  –   Accelerators: NET, SEC, RAID, DMA, COMP, etc.
  –   SOCs: USB, PCIE, FLASH, I2C, etc.
• Need for a software-enabled virtualization solution
• Xen ported and provided as a solution




General Xen Usage Model




     Xen dom0               Xen domU          Xen domU
    (Mgmt App)              (Guest OS)        (Guest OS)




launch   Create, monitor, destroy


                             Xen Hypervisor


            Hardware (CPU, Memory, Disk, Net/PCI, etc)


Hybrid Control/Data Plane Model



                       Shared Memory




       Control Plane         Data Plane     Data Plane



                       Bare Metal Linux


                                          NET



Proposed Model in Xen



                       Shared Memory




          Dom0                DomU           DomU
       Control Plane        Data Plane     Data Plane



                           Xen


                                         NET



Outline

• CPU Virtualization (mips64r2 only)
   • Memory virtualization
   • Instruction emulation
   • Exception handling
   • Event Channel and Timer Interrupt
• Preliminary Benchmarking Results
• Summary and Future Work




Change of Privilege Levels


       Bare Metal Mode              Virtualization Mode

       User apps (user ring)        User apps (user ring)
                                    Linux (supervisor ring)
       Linux (kernel ring)          Xen (kernel ring)



Address Spaces


              user space         guest kernel virtual space
GVA
               (0 – 2^40)        (0x4000 0000 0000 0000 -)




 GPA        guest 0 phys addr                    guest N phys addr



  MA                               machine memory

 Kernel code + data | shared pages with xen | … | kernel page table | free pages


0x0                                                 Size allocated to each guest
       Xen in unmapped space
Page Table Management


        guest page table               P2M table



                            GPA                    MA
  GVA



                    new guest page table



              GVA                          MA
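
The composition in the diagram above can be sketched as follows. This is a minimal illustration, not the port's actual code: the dictionaries standing in for the guest page table and the P2M table, the 4 KiB page size, and the function names are all assumptions made for the example.

```python
PAGE_SHIFT = 12                      # assumed 4 KiB pages, for illustration only
PAGE_MASK = (1 << PAGE_SHIFT) - 1


def build_machine_page_table(guest_pt, p2m):
    """Compose GVA->GPA with GPA->MA, one page at a time.

    guest_pt: guest virtual page number  -> guest physical frame number
    p2m:      guest physical frame number -> machine frame number
    """
    machine_pt = {}
    for vpn, gpfn in guest_pt.items():
        if gpfn not in p2m:
            raise KeyError(f"guest frame {gpfn:#x} has no machine frame")
        machine_pt[vpn] = p2m[gpfn]
    return machine_pt


def translate(machine_pt, gva):
    """Translate a guest virtual address to a machine address."""
    mfn = machine_pt[gva >> PAGE_SHIFT]
    return (mfn << PAGE_SHIFT) | (gva & PAGE_MASK)


# Hypothetical mappings for two pages.
guest_pt = {0x10: 0x2, 0x11: 0x3}
p2m = {0x2: 0x8A0, 0x3: 0x77F}
machine_pt = build_machine_page_table(guest_pt, p2m)
```

The resulting table maps GVA straight to MA, so a translation needs only one lookup at run time.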




Page Table Layout



                       Bare Metal Linux
         pgd
                                 pmd
                                           pte
    VA of PMD page
                          VA of PTE page   PA




    pgd: page global directory
    pmd: page middle directory
    pte: page table entry



Page Table Layout

                              PV Linux
             pgd
                                   pmd
 xkphys (MA of PMD page)                              pte
                           xkphys(MA of PTE page)   MA address




 xkphys: 64-bit kernel physical space (unmapped)
 xkphys: avoid TLB refill during page table walk
 Hardware page walker is used
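
A rough sketch of the PV layout above: upper-level entries hold the xkphys address of the next-level page, so each step of the walk is an unmapped access. The 9-bit index per level, the cached attribute value, the dictionary-based memory model, and all names are assumptions for illustration. They do not describe the XLP hardware walker.

```python
XKPHYS_BASE = 0x8000_0000_0000_0000   # top address bits 0b10 select xkphys
CCA_SHIFT = 59                         # bits 61:59 hold the cache attribute
CCA_CACHED = 3                         # assumed cacheable attribute value
PA_MASK = (1 << CCA_SHIFT) - 1

PAGE_SHIFT = 12                        # assumed 4 KiB pages
INDEX_BITS = 9                         # assumed 512 entries per table page
INDEX_MASK = (1 << INDEX_BITS) - 1


def to_xkphys(ma, cca=CCA_CACHED):
    """Form the unmapped xkphys address for machine address `ma`."""
    return XKPHYS_BASE | (cca << CCA_SHIFT) | ma


def from_xkphys(addr):
    """Recover the machine address from an xkphys address."""
    return addr & PA_MASK


def walk(memory, pgd_ma, gva):
    """Three-level walk; `memory` maps a table page's MA to {index: entry}."""
    pgd_i = (gva >> (PAGE_SHIFT + 2 * INDEX_BITS)) & INDEX_MASK
    pmd_i = (gva >> (PAGE_SHIFT + INDEX_BITS)) & INDEX_MASK
    pte_i = (gva >> PAGE_SHIFT) & INDEX_MASK
    pmd_ma = from_xkphys(memory[pgd_ma][pgd_i])   # pgd entry: xkphys of PMD page
    pte_ma = from_xkphys(memory[pmd_ma][pmd_i])   # pmd entry: xkphys of PTE page
    page_ma = memory[pte_ma][pte_i]               # pte entry: MA of the data page
    return page_ma | (gva & ((1 << PAGE_SHIFT) - 1))


# Hypothetical tables: pgd at MA 0x1000, pmd at 0x2000, pte page at 0x3000.
memory = {
    0x1000: {0: to_xkphys(0x2000)},
    0x2000: {1: to_xkphys(0x3000)},
    0x3000: {2: 0x8A0000},
}
```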


Instruction Emulation

• Privileged instructions in guests get trapped and emulated
• Xen trap handlers decode the instruction and emulate it
  appropriately
• A few instructions cause hardware state to change, while
  others change the shadow state
• Shadow state is maintained per virtual cpu of domains
  [Figure: privileged instructions trap from the guest into Xen.
   Labels: shadow states (cop0 regs, bookkeeping for exception
   propagation); mfc0, mtc0, ei/di, eret, …; tlbp, tlbr; tlbs, caches]
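
As a minimal sketch of trap-and-emulate against per-vcpu shadow state, the following decodes the standard MIPS encodings of mfc0 and mtc0 and applies them to a shadow copy of the cop0 registers. The `VCpu` class and function name are hypothetical, and branch delay slots and the sign extension of 32-bit reads are left out to keep the example short.

```python
COP0_OPCODE = 0x10   # bits 31:26 of every coprocessor-0 instruction
RS_MFC0 = 0x00       # bits 25:21: move from coprocessor 0
RS_MTC0 = 0x04       # bits 25:21: move to coprocessor 0


class VCpu:
    """Hypothetical per-vcpu state kept by the hypervisor."""

    def __init__(self):
        self.gpr = [0] * 32       # guest general-purpose registers
        self.shadow_cop0 = {}     # (register, select) -> shadow value
        self.pc = 0


def emulate_privileged(vcpu, insn):
    """Emulate one trapped mfc0/mtc0 against the vcpu's shadow state."""
    if (insn >> 26) & 0x3F != COP0_OPCODE:
        raise ValueError("not a coprocessor-0 instruction")
    rs = (insn >> 21) & 0x1F
    rt = (insn >> 16) & 0x1F
    rd = (insn >> 11) & 0x1F
    sel = insn & 0x7
    if rs == RS_MFC0:
        if rt != 0:               # register $zero is hardwired to 0
            vcpu.gpr[rt] = vcpu.shadow_cop0.get((rd, sel), 0)
    elif rs == RS_MTC0:
        vcpu.shadow_cop0[(rd, sel)] = vcpu.gpr[rt]
    else:
        raise NotImplementedError(f"unhandled cop0 instruction {insn:#010x}")
    vcpu.pc += 4                  # resume after the emulated instruction


vcpu = VCpu()
vcpu.gpr[8] = 0x1234
emulate_privileged(vcpu, 0x40886000)   # mtc0 $8, $12 (Status)
emulate_privileged(vcpu, 0x40096000)   # mfc0 $9, $12 (Status)
```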
Hypercalls

• The service API between guests and xen
• Analogous to system calls between userspace and linux
• Used when a particular service is requested or the overhead
  of trap and emulate is high
• Implemented using the “syscall” instruction
• Sample uses: vcpu creation, cache flush requests, etc.

                      userspace
          syscall
                        linux
        hypercall
                        xen
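
One way to picture the service API is a table indexed by hypercall number. The sketch below is only an illustration: the hypercall numbers, handler names, and the dictionary used to represent a domain are invented here and are not taken from the port.

```python
import errno


def do_vcpu_create(domain, vcpu_id):
    """Hypothetical handler: record a new vcpu for the calling domain."""
    domain.setdefault("vcpus", []).append(vcpu_id)
    return 0


def do_cache_flush(domain, start, length):
    """Hypothetical handler: record a cache flush request."""
    domain.setdefault("flushes", []).append((start, length))
    return 0


# Hypothetical hypercall numbers, chosen only for this example.
HYPERCALL_TABLE = {
    0: do_vcpu_create,
    1: do_cache_flush,
}


def handle_hypercall(domain, number, *args):
    """Dispatch a hypercall; unknown numbers return -ENOSYS."""
    handler = HYPERCALL_TABLE.get(number)
    if handler is None:
        return -errno.ENOSYS
    return handler(domain, *args)


domU = {}
rc_create = handle_hypercall(domU, 0, 1)
rc_flush = handle_hypercall(domU, 1, 0x1000, 64)
rc_unknown = handle_hypercall(domU, 99)
```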

Exception Handling

• Exceptions triggered by guests handled by xen
  – Hypercalls
  – Address error exception
  – Privileged instruction traps
• Exceptions triggered by userspace bounced into guests
  – Guests register callbacks for exception entry points such as the general
    exception vector
  – Xen maintains shadow state to return to userspace after the
    propagated exception is handled
  – Interrupts injected into guests while the bounced exception is
    handled, retaining regular linux semantics




A syscall example

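  Layers, top to bottom: Applications, Guest Kernel, Xen

  1. syscall insn                                   (Applications)
  2. control transfers to xen
  3. xen syscall handler                            (Xen)
  4. xen bounces syscall to guest                   (Xen)
  5. guest syscall handling                         (Guest Kernel)
     – privileged insns trap into xen, which emulates them against
       the shadow architecture state and returns with eret
  6. xen executes eret handler                      (Xen)
  7. xen restores original state, does final eret   (Xen)
  8. app resumes after syscall                      (Applications)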
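
The bounce and the final eret can be sketched with a small state machine. Everything here is a simplified assumption: the `VCpu` and `ShadowState` classes, the mode names, and the addresses are hypothetical, and the guest's handler is reduced to advancing the saved return address past the syscall instruction.

```python
from dataclasses import dataclass, field

USER = "user"
GUEST_KERNEL = "guest kernel"


@dataclass
class ShadowState:
    epc: int = 0          # where to resume once the guest does eret
    mode: str = USER      # mode to return to


@dataclass
class VCpu:
    pc: int = 0
    mode: str = USER
    syscall_entry: int = 0                     # entry point registered by the guest
    shadow: ShadowState = field(default_factory=ShadowState)
    log: list = field(default_factory=list)


def on_syscall_trap(vcpu):
    """Steps 2-4: xen takes the trap and bounces it to the guest."""
    if vcpu.mode == GUEST_KERNEL:
        vcpu.log.append("hypercall")           # syscall from the guest kernel
        vcpu.pc += 4
        return
    vcpu.shadow.epc = vcpu.pc                  # remember the userspace context
    vcpu.shadow.mode = vcpu.mode
    vcpu.mode = GUEST_KERNEL
    vcpu.pc = vcpu.syscall_entry
    vcpu.log.append("bounce")


def on_eret_trap(vcpu):
    """Steps 6-7: xen restores the saved state and does the final eret."""
    vcpu.pc = vcpu.shadow.epc
    vcpu.mode = vcpu.shadow.mode
    vcpu.log.append("eret")


vcpu = VCpu(pc=0x400100, syscall_entry=0x4000_0000_0000_0180)
on_syscall_trap(vcpu)                          # step 1 traps into xen
mode_in_handler, pc_in_handler = vcpu.mode, vcpu.pc
vcpu.shadow.epc += 4                           # step 5: guest skips the syscall insn
on_eret_trap(vcpu)                             # guest's eret traps into xen
```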




Event Channels

• Events: asynchronous notifications to domains (akin to
  signals in Unix)
• Event channels: abstract duplex communication channels
  (akin to sockets): <dom1, port1; dom2, port2>
• Interrupts are mapped to events
  – Intradomain & interdomain events (e.g., domU console)
  – Virtual IRQ (e.g., timer interrupts)
  – Physical IRQ (e.g., passthrough device interrupts)
• Delivered through a callback function
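
A toy model of the points above: a channel is a pair of (domain, port) endpoints, sending on one end marks the peer port pending, and the receiving domain drains pending ports through a callback. The class and method names are hypothetical and do not mirror Xen's real interface.

```python
class Domain:
    """Hypothetical domain with a pending-event set and per-port handlers."""

    def __init__(self, name):
        self.name = name
        self.pending = set()     # ports with an undelivered event
        self.handlers = {}       # port -> callable

    def bind(self, port, handler):
        self.handlers[port] = handler

    def run_callback(self):
        """Deliver all pending events, lowest port first."""
        delivered = []
        while self.pending:
            port = min(self.pending)
            self.pending.discard(port)
            self.handlers[port](port)
            delivered.append(port)
        return delivered


class EventChannels:
    """Duplex channels of the form <dom1, port1; dom2, port2>."""

    def __init__(self):
        self.peers = {}          # (domain name, port) -> (peer domain, peer port)

    def connect(self, dom1, port1, dom2, port2):
        self.peers[(dom1.name, port1)] = (dom2, port2)
        self.peers[(dom2.name, port2)] = (dom1, port1)

    def send(self, dom, port):
        """Mark the peer end pending; delivery happens in its callback."""
        peer_dom, peer_port = self.peers[(dom.name, port)]
        peer_dom.pending.add(peer_port)


dom0, domU = Domain("dom0"), Domain("domU")
channels = EventChannels()
channels.connect(dom0, 3, domU, 1)
received = []
domU.bind(1, received.append)
channels.send(dom0, 3)
pending_before = set(domU.pending)
delivered = domU.run_callback()
```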




Time Management

• Time keeping in xen
  – Maintaining system time: uses an XLP-specific internal global 64-bit
    free-running counter
  – Requesting timer interrupts: done by maintaining per-cpu timer list
    and programming the count/compare registers
• Guest OS
  – Xen clocksource: a hardware abstraction for a free running counter to
    maintain system time
     – Maintained through timestamps written by xen on a shared page
  – Xen clockevent: an interface to request timer interrupts
     – Done using a hypercall to program a single-shot timer in xen
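
The clocksource idea can be sketched as follows: xen writes a timestamp pair to the shared page, and the guest extrapolates from it using the free-running counter. The field names, the version-based consistency check, and the 100 MHz counter frequency are assumptions for illustration, not details of the XLP port.

```python
from dataclasses import dataclass

COUNTER_HZ = 100_000_000          # assumed counter frequency, for illustration
NS_PER_SEC = 1_000_000_000


@dataclass
class SharedTimeInfo:
    """Hypothetical layout of the time fields on the shared page."""
    version: int = 0              # odd while xen is in the middle of an update
    counter_stamp: int = 0        # counter value when the stamp was taken
    system_time_ns: int = 0       # system time at that counter value


def xen_update(info, counter_now, system_time_ns):
    """Hypervisor side: publish a new timestamp pair."""
    info.version += 1             # mark update in progress
    info.counter_stamp = counter_now
    info.system_time_ns = system_time_ns
    info.version += 1             # mark update complete


def guest_clocksource_read(info, read_counter):
    """Guest side: extrapolate system time from the last stamp."""
    while True:
        version = info.version
        if version & 1:
            continue              # update in progress, retry
        stamp, base = info.counter_stamp, info.system_time_ns
        delta = read_counter() - stamp
        if info.version == version:
            return base + delta * NS_PER_SEC // COUNTER_HZ


info = SharedTimeInfo()
xen_update(info, counter_now=1000, system_time_ns=5_000_000)
now_ns = guest_clocksource_read(info, read_counter=lambda: 1500)
```

Reading the time this way needs no hypercall. Only requesting a timer interrupt has to enter xen.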




Timer Interrupt Illustration


  Layers, top to bottom: Applications, Guest Kernel, Xen

  1. timer interrupt occurs
  2. control transfers to xen
  3. xen interrupt handler executes                 (Xen)
  4. xen sets event pending for the guest           (Xen)
  5. xen injects event into guest                   (Xen)
  6. guest executes event handler                   (Guest Kernel)
  7. guest does eret                                (Guest Kernel)
  8. xen executes eret handler                      (Xen)
  9. xen restores original state, does final eret   (Xen)




Performance Optimization

• Expose certain shadow states for guest OS to avoid
  excessive exception start/end cost
• When guest executes “wait” insn, xen tries to “wait” also to
  avoid burning cpu resources




Preliminary Benchmarking Results

• XLP832: 8 cores, 4 threads each core, 1.0 GHz
  – Only 1 core, 4 threads used for measuring time
• Intel Core 2: 2 cores, 1 thread per core, 2.4 GHz
  – Not using hardware virtualization extensions
• CPU/Memory intensive benchmarks like dhrystone, eembc,
  coremark, etc.
  – 0 – 5% slowdown for dom0 compared to bare metal linux, for both x86
    and XLP
• Hackbench (a lot of system calls)
  – 2X slowdown for dom0 compared to bare metal linux, for both x86
    and XLP
• No noticeable performance difference between dom0 and
  domU on XLP

Summary and Future Work

• A MIPS port of xen paravirtualization has been implemented
  – MMU, exception/interrupt handling, etc.
  – Comparable performance to x86 for bare metal vs. xen
• Currently, our implementation uses xen 3.4.0 for xen
  hypervisor, 4.0.0 for xen tools, linux 2.6.32 for PV linux, so
  we need to
  – Update to latest versions of Xen
  – Submit patches upstream
• More work on I/O paravirtualization
• Ongoing collaboration with MIPS Technologies




Thank You