I am running Xen with a Buildroot Linux on CPU0 and a bare-metal application written in C on CPU1. The architecture is ARM64.

The shared memory itself works: I map a 4096-byte region (an array of 4096 values).

I am trying to send data from Linux to the bare-metal application and back.

I have written this code. Linux side:

#define BUF_SIZE 4096
#define ITERATIONS 10000

// Map one 4 KiB page of the bare-metal domain.
// NOTE: 40000000 is decimal as written; a hex address such as 0x40000000 may be intended.
volatile char *memory = (volatile char *)xc_map_foreign_range(xc, domainid, 0x1000, PROT_WRITE | PROT_READ, (unsigned long) 40000000 >> 12);

uint8_t *local_buf = malloc(BUF_SIZE);
memset(local_buf, 0xAB, BUF_SIZE);
int i;
double start = now();

memory[0] = 0;  // flag byte: 0 = buffer free, 1 = payload ready
for (i = 0; i < ITERATIONS; i++) {
    memcpy((void *)(memory + 1), local_buf, BUF_SIZE - 1);  // payload follows the flag byte
    memory[0] = 1;                                          // signal the bare-metal side
    while (memory[0]) __asm__ volatile("yield");            // wait until it clears the flag
}

double end = now();

double elapsed = end - start;
double total_bytes = (double)BUF_SIZE * 1 * ITERATIONS; // factor 1 = write only; 2 would count write + read
double mbps = (total_bytes / elapsed) / (1024.0 * 1024.0);

printf("%.2f MB in %.3f sec = %.2f MB/s\n", total_bytes / (1024.0 * 1024.0), elapsed, mbps);

Bare-metal side:

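uint8_t *local = (uint8_t *) malloc(sizeof(uint8_t) * 0x1000);
volatile uint8_t *tmp = (volatile uint8_t *) (BUF + 0);  // BUF: base address of the shared page

for (;;) {
    if (tmp[0] == 1) {  // Linux has filled the buffer
        // Copy the 0x1000 - 1 payload bytes that follow the flag byte.
        memcpy(local, (const void *) (tmp + 1), 0x1000 - 1);
        tmp[0] = 0;     // hand the buffer back to Linux
    }
}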
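
For reference, the flag-plus-payload handshake that both loops rely on can be written with C11 atomics instead of plain volatile accesses. This is only a minimal sketch under assumptions: `struct mailbox`, `mailbox_send` and `mailbox_recv` are hypothetical names, the layout (one flag byte followed by 4095 payload bytes) mirrors the code above, and it presumes both sides map the page as normal, coherent memory. It does not change the polling nature of the protocol; it only makes the ordering between the payload copy and the flag update explicit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE   4096u
#define PAYLOAD_MAX (PAGE_SIZE - 1u)

/* Hypothetical layout of the shared page: one flag byte, then the payload. */
struct mailbox {
    _Atomic uint8_t full;           /* 0 = empty, 1 = payload ready */
    uint8_t payload[PAYLOAD_MAX];
};

/* The struct is meant to overlay exactly one 4 KiB page. */
static_assert(sizeof(struct mailbox) == PAGE_SIZE, "mailbox must fill one page");

/* Producer: copy the payload, then publish it with a release store so the
 * consumer cannot observe full == 1 before the payload bytes are visible. */
bool mailbox_send(struct mailbox *mb, const void *data, size_t len)
{
    if (len > PAYLOAD_MAX)
        return false;
    if (atomic_load_explicit(&mb->full, memory_order_acquire) != 0)
        return false;               /* previous message not consumed yet */
    memcpy(mb->payload, data, len);
    atomic_store_explicit(&mb->full, 1, memory_order_release);
    return true;
}

/* Consumer: the acquire load pairs with the producer's release store; the
 * final release store hands the buffer back only after the copy is done. */
bool mailbox_recv(struct mailbox *mb, void *out, size_t len)
{
    if (len > PAYLOAD_MAX)
        return false;
    if (atomic_load_explicit(&mb->full, memory_order_acquire) != 1)
        return false;               /* nothing to read */
    memcpy(out, mb->payload, len);
    atomic_store_explicit(&mb->full, 0, memory_order_release);
    return true;
}
```

On the Linux side the pointer returned by `xc_map_foreign_range` would be cast to `struct mailbox *`; on the bare-metal side the same cast would be applied to `BUF`.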

I know this is archaic! My problem is the synchronization method, but with a bare-metal application I don't know whether another method is possible. I get more than 2 GB/s without synchronization but only 200 KB/s with it. I need at least 5 MB/s.

I tried to use the event libraries from Xen, but my project is complex and already includes a lot of libraries from Xilinx.

What other method exists?

--- edit ---

I want to send IP packets with a latency of less than 10 ms, because I will be carrying video and audio data. The maximum size is 4096 bytes per packet, but most of the time it should be around 1500 bytes because of the MTU.
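
To put these numbers side by side, here is a small sketch of the arithmetic, assuming the rates quoted above are in bytes per second with the same 1024-based units as the `printf` in the benchmark. The helper names `packets_per_second` and `budget_us` are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Number of packets (or handshakes) per second implied by a byte rate. */
double packets_per_second(double bytes_per_sec, double packet_bytes)
{
    assert(packet_bytes > 0.0);
    return bytes_per_sec / packet_bytes;
}

/* Time available per packet (or spent per handshake), in microseconds. */
double budget_us(double per_second)
{
    assert(per_second > 0.0);
    return 1e6 / per_second;
}

/* Print the measured case and the target case from the question. */
void print_budget(void)
{
    /* Measured: about 200 KiB/s while moving one 4096-byte buffer per handshake. */
    double measured = packets_per_second(200.0 * 1024.0, 4096.0);
    /* Target: 5 MiB/s with typical 1500-byte packets, one packet per handshake. */
    double target = packets_per_second(5.0 * 1024.0 * 1024.0, 1500.0);

    printf("measured: %.1f handshakes/s, %.0f us each\n", measured, budget_us(measured));
    printf("target:   %.1f packets/s, %.0f us each\n", target, budget_us(target));
}
```

With these assumptions, 200 KiB/s at 4096 bytes per handshake is 50 handshakes per second, or 20 ms per round trip, which is already above the 10 ms latency target. Reaching 5 MiB/s with 1500-byte packets needs about 3495 packets per second, or roughly 286 µs per packet if each packet gets its own handshake.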

  • you'll need to tell us much more about what kind of things you need to exchange between these two CPUs, and which latency / granularity of synchronization you're aiming for. Commented Sep 19 at 17:37
  • ok, what's the size of these packets? Commented Sep 19 at 18:41
  • ha! A simpler way here might be to actually push things through the PL, in a simulated network interface fashion Commented Sep 19 at 18:54
  • Yes, I have to try this solution. But the way with only the shared memory looks simpler... Commented Sep 19 at 19:05
  • Not an answer to your question, but I suggest you look at the source code for Looking Glass "An extremely low latency KVMFR (KVM FrameRelay) implementation for guests with VGA PCI Passthrough". That description doesn't really say what it's for, which is to run games etc in a VM with a dedicated GPU and display them on the host's monitor with negligible performance loss. It's based on KVM rather than Xen, but you might get some ideas for high-speed low latency shared memory transfers from it. Commented Sep 20 at 7:38
