Skip to content

lkl tools: virtio net fd: make it posix compatible#202

Merged
10 commits merged intomasterfrom
unknown repository
Aug 16, 2016
Merged

lkl tools: virtio net fd: make it posix compatible#202
10 commits merged intomasterfrom
unknown repository

Conversation

@ghost
Copy link

@ghost ghost commented Aug 13, 2016

Rework the virtio net Linux fd implementation to make it POSIX
compatible. We do this by removing dependencies on epoll and eventfd.

Instead of using eventfd to shutdown the poll thread we use a control
pipe - closing the write side of the pipe will issue a POLLHUP and
wake-up the poll thread.

Note that this patch uses pipe2 instead of pipe in order to address a
libhijack limitation where pipe is hijacked regardless of calls coming
from the library or not. This makes virtio_net_fd Linux specific, but
we can easily switch back to pipe once the libhijack issue is resolved.

Like-wise, instead of using an epoll edge-triggered approach to
mitigate the high CPU usage on the TX side, we use a level-triggered
poll approach where the poll POLLIN / POLLOUT mask is set only when is
needed, i.e. when the read or write calls returns EAGAIN. The control
pipe is used to wake-up the poll thread and change the poll mask to
avoid race between the poll thread and LKL threads issuing reads or
writes.

Also collapse the rx and tx poll threads into a single poll thread. I
did not notice any performance degradation and reducing the number of
threads reduces the complexity.

Since the the poll thread can now be stopped reliably the patch also
removes lkl_netdevs_remove() and makes lkl_netdev_remove() available
to make the lkl_netdev_add/remove operations
symmetric. lkl_netdev_remove() can be called by the application before
lkl_sys_halt so we can also remove it from lkl_sys_halt.

Signed-off-by: Octavian Purdila octavian.purdila@intel.com


This change is Reviewable

@thehajime
Copy link
Member

Thanks @opurdila - I gave a quick try and #195 looks like alleviated (I didn't reproduce hangs so far).

I will review more carefully this PR later.

@thehajime
Copy link
Member

  • minor comments on styles can be checked by checkpatch.pl
  • isn't it meaningful if we keep the current edge-triggered poll mode as a non-default option ? it's certainly useful when the number of interfaces is large (though it still has an issue of lkl: retry writev() when EAGAIN is notified #195).
    • though (2) we don't have a way to configure multiple NICs with hijack library (non-hijacking mode can do btw)
  • it's great if somebody check how far from POSIX compliant with this update, by checking virtio-net code on FreeBSD :)

Review status: 0 of 12 files reviewed at latest revision, 6 unresolved discussions.


tools/lkl/include/lkl_host.h, line 101 [r1] (raw file):

   * Closes a net device.
   *
   * Implementation should release any resources releated to it. After

releated => related


tools/lkl/lib/virtio_net.c, line 146 [r1] (raw file):

  do {
      int ret = dev->ops->poll(dev->nd);

do we want to handle if ret < 0 ?


tools/lkl/lib/virtio_net_fd.c, line 46 [r1] (raw file):

      } else {
          char tmp;
          nd_fd->poll_mask |= POLLOUT;

missing blank line


tools/lkl/lib/virtio_net_fd.c, line 69 [r1] (raw file):

      } else {
          char tmp;
          nd_fd->poll_mask |= POLLIN;

missing blank line


tools/lkl/lib/virtio_net_fd.c, line 107 [r1] (raw file):

  if (pfds[1].revents & POLLIN) {
      char tmp[PIPE_BUF];
      ret = read(nd_fd->pipe[0], tmp, PIPE_BUF);

missing blank line


tools/lkl/lib/virtio_net_fd.c, line 162 [r1] (raw file):

  nd->fd = fd;
  /* FIXME: pipe2 is Linux specific but pipe is intercepted lib/hijack.c
   * so this is a workaround until we find a better solution */

line feed (see checkpatch.pl)


Comments from Reviewable

@pscollins
Copy link
Member

This looks like a great simplification of what was some very ugly code! FWIW with the hijack issue, you can use the workaround I had for eventfd:

struct lkl_netdev_linux_fdnet_ops lkl_netdev_linux_fdnet_ops = {        
    .eventfd = (int (*)(unsigned int, int))eventfd,     
 };

void fixup_netdev_linux_fdnet_ops(void)     
 {          
    lkl_netdev_linux_fdnet_ops.eventfd = dlsym(RTLD_NEXT, "eventfd");       
 }      

I guess this avoids the unspecified behavior of closing an fd that's being polled since it relies on the write side rather than the read side?

@liuyuan10
Copy link
Member

Reviewed 11 of 13 files at r1.
Review status: 10 of 12 files reviewed at latest revision, 8 unresolved discussions.


tools/lkl/lib/virtio_net.c, line 289 [r1] (raw file):

  virtio_dev_cleanup(&dev->dev);

  lkl_host_ops.mem_free(dev->nd);

So who free dev-nd now?


tools/lkl/lib/virtio_net_fd.c, line 84 [r1] (raw file):

      {
          .fd = nd_fd->fd,
          .events = nd_fd->poll_mask,

Is there race condition on poll_mask? It's accessed/modified by poll_thread and lkl kernel thread without synchronization.


Comments from Reviewable

@ghost
Copy link
Author

ghost commented Aug 15, 2016

@pscollins I feel that this sort of workaround is trying to solve the problem in the wrong place. We should try to fix it somehow in the hijack library - for example by checking if LKL is running and if not use native pipe instead of LKL pipe. I have it on my TODO list, and hope to have something soon.

@thehajime Unfortunately there is no POSIX edge-triggered primitive we can use. Also, there is no fundamental difference between edge and level, we can easily emulate edge with level by making sure to read / write until we get -EAGAIN. So at the moment I don't see any reason to try use edge-trigger notifications.


Review status: 6 of 12 files reviewed at latest revision, 8 unresolved discussions.


tools/lkl/include/lkl_host.h, line 101 [r1] (raw file):

Previously, thehajime (Hajime Tazaki) wrote…

releated => related

Done.

tools/lkl/lib/virtio_net.c, line 146 [r1] (raw file):

Previously, thehajime (Hajime Tazaki) wrote…

do we want to handle if ret < 0 ?

Good idea, I've added code to check and report errors.

tools/lkl/lib/virtio_net.c, line 289 [r1] (raw file):

Previously, liuyuan10 (Yuan Liu) wrote…

So who free dev-nd now?

Good catch. Although freeing dev->nd here seems wrong, it should be freed where it is allocated, e.g. virtio_net_fd.c I will keep it for now but I will add a new commit to clean this up.

tools/lkl/lib/virtio_net_fd.c, line 46 [r1] (raw file):

Previously, thehajime (Hajime Tazaki) wrote…

missing blank line

Done.

tools/lkl/lib/virtio_net_fd.c, line 69 [r1] (raw file):

Previously, thehajime (Hajime Tazaki) wrote…

missing blank line

Done.

tools/lkl/lib/virtio_net_fd.c, line 84 [r1] (raw file):

Previously, liuyuan10 (Yuan Liu) wrote…

Is there race condition on poll_mask? It's accessed/modified by poll_thread and lkl kernel thread without synchronization.

I don't think we need full synchronization. There was a race condition when we updated the poll mask from tx/rx ops at the same time, but I think we can avoid it if we use separate tx/rx variables, please see the new implementation.

tools/lkl/lib/virtio_net_fd.c, line 107 [r1] (raw file):

Previously, thehajime (Hajime Tazaki) wrote…

missing blank line

Done.

tools/lkl/lib/virtio_net_fd.c, line 162 [r1] (raw file):

Previously, thehajime (Hajime Tazaki) wrote…

line feed (see checkpatch.pl)

Done.

Comments from Reviewable

@liuyuan10
Copy link
Member

Reviewed 2 of 6 files at r2.
Review status: 5 of 12 files reviewed at latest revision, 11 unresolved discussions, some commit checks broke.


tools/lkl/lib/virtio_net.c, line 267 [r3] (raw file):

/* Return 0 for success, -1 for failure. */
void lkl_netdev_remove(int id)

I'm seeing the following error after running netperf tens of times. I doubt if it's because we called lkl_netdev_remove before lkl_sys_halt

[ 11.800000] ------------[ cut here ]------------
[ 11.800000] WARNING: CPU: 0 PID: 12 at ./arch/lkl/include/asm/io.h:81 vm_notify+0x6c/0x70
[ 11.800000] error writing iomem 0000000008000050
[ 11.800000] Call Trace:
[ 11.800000] 00007f5d5a7fb718: [<7f5d6ec6949c>] vm_notify+0x6c/0x70
[ 11.800000] 00007f5d5a7fb738: [<7f5d6e9e2aba>] __warn+0xca/0xf0
[ 11.800000] 00007f5d5a7fb788: [<7f5d6e9e2b70>] warn_slowpath_fmt+0x90/0xa0
[ 11.800000] 00007f5d5a7fb818: [<7f5d6ecb676a>] __kfree_skb+0x9a/0xb0
[ 11.800000] 00007f5d5a7fb848: [<7f5d6ecb6868>] consume_skb+0x28/0x70
[ 11.800000] 00007f5d5a7fb868: [<7f5d6ec6949c>] vm_notify+0x6c/0x70
[ 11.800000] 00007f5d5a7fb898: [<7f5d6ec67de5>] virtqueue_notify+0x15/0x40
[ 11.800000] 00007f5d5a7fb8b8: [<7f5d6ec67e32>] virtqueue_kick+0x22/0x30
[ 11.800000] 00007f5d5a7fb8d8: [<7f5d6ec89e49>] start_xmit+0x2d9/0x520
[ 11.800000] 00007f5d5a7fb938: [<7f5d6ecc7205>] dev_hard_start_xmit+0x245/0x350
[ 11.800000] 00007f5d5a7fb9b8: [<7f5d6ece5d4b>] sch_direct_xmit+0xbb/0x1a0
[ 11.800000] 00007f5d5a7fb9f8: [<7f5d6ecc7617>] __dev_queue_xmit+0x1c7/0x4a0
[ 11.800000] 00007f5d5a7fba58: [<7f5d6ecc78fb>] dev_queue_xmit+0xb/0x10
[ 11.800000] 00007f5d5a7fba68: [<7f5d6ecf57d1>] ip_finish_output2+0x1a1/0x380
[ 11.800000] 00007f5d5a7fbac8: [<7f5d6ecf77fe>] ip_finish_output+0x16e/0x210
[ 11.800000] 00007f5d5a7fbad8: [<7f5d6ec66bdd>] atomic64_add+0x2d/0x40
[ 11.800000] 00007f5d5a7fbb18: [<7f5d6ecf7a79>] ip_output+0x89/0xa0
[ 11.800000] 00007f5d5a7fbb68: [<7f5d6ecf6824>] ip_local_out+0x34/0x40
[ 11.800000] 00007f5d5a7fbb98: [<7f5d6ecf6b85>] ip_queue_xmit+0x165/0x3d0
[ 11.800000] 00007f5d5a7fbbe8: [<7f5d6ed0f2d3>] tcp_transmit_skb+0x4c3/0xa20
[ 11.800000] 00007f5d5a7fbc58: [<7f5d6ed11a64>] tcp_send_active_reset+0xc4/0x160
[ 11.800000] 00007f5d5a7fbc88: [<7f5d6ed02260>] tcp_close+0x3f0/0x480
[ 11.800000] 00007f5d5a7fbcb8: [<7f5d6ed2b477>] inet_release+0x37/0x60
[ 11.800000] 00007f5d5a7fbcd8: [<7f5d6ecae676>] sock_release+0x16/0x90
[ 11.800000] 00007f5d5a7fbcf8: [<7f5d6ecae6fd>] sock_close+0xd/0x20
[ 11.800000] 00007f5d5a7fbd08: [<7f5d6ea46946>] __fput+0x56/0x150
[ 11.800000] 00007f5d5a7fbd48: [<7f5d6ea46a8b>] delayed_fput+0x4b/0x60
[ 11.800000] 00007f5d5a7fbd68: [<7f5d6e9f3694>] process_one_work+0x154/0x380
[ 11.800000] 00007f5d5a7fbdc8: [<7f5d6e9f3a1e>] worker_thread+0x15e/0x500
[ 11.800000] 00007f5d5a7fbe38: [<7f5d6e9f8d3d>] kthread+0x10d/0x130
[ 11.800000] 00007f5d5a7fbe40: [<7f5d6e9f38c0>] worker_thread+0x0/0x500
[ 11.800000] 00007f5d5a7fbea0: [<7f5d6e9f8c30>] kthread+0x0/0x130
[ 11.800000] 00007f5d5a7fbec0: [<7f5d6e9f8c30>] kthread+0x0/0x130
[ 11.800000] 00007f5d5a7fbed8: [<7f5d6e9df316>] thread_bootstrap+0x56/0x60
[ 11.800000]
[ 11.800000] ---[ end trace 508e97b5072f224d ]---


tools/lkl/lib/virtio_net_fd.c, line 115 [r3] (raw file):

  if (nd_fd->poll_rx)
      pfds[0].events |= POLLIN|POLLPRI;

Should we inititialize events to 0?


tools/lkl/lib/virtio_net_fd.c, line 178 [r3] (raw file):

  struct lkl_netdev_fd *nd;

  nd = malloc(sizeof(*nd));

it's better to use host_ops->mem_alloc? because you are using host_ops->free when closing it.


Comments from Reviewable

@ghost
Copy link
Author

ghost commented Aug 16, 2016

I've pushed a new version is adds a few more cleanups and it also splits the changes to make them easier to review.


Review status: 3 of 16 files reviewed at latest revision, 11 unresolved discussions.


tools/lkl/lib/virtio_net.c, line 267 [r3] (raw file):

Previously, liuyuan10 (Yuan Liu) wrote…

I'm seeing the following error after running netperf tens of times. I doubt if it's because we called lkl_netdev_remove before lkl_sys_halt

[ 11.800000] ------------[ cut here ]------------
[ 11.800000] WARNING: CPU: 0 PID: 12 at ./arch/lkl/include/asm/io.h:81 vm_notify+0x6c/0x70
[ 11.800000] error writing iomem 0000000008000050
[ 11.800000] Call Trace:
[ 11.800000] 00007f5d5a7fb718: [<7f5d6ec6949c>] vm_notify+0x6c/0x70
[ 11.800000] 00007f5d5a7fb738: [<7f5d6e9e2aba>] __warn+0xca/0xf0
[ 11.800000] 00007f5d5a7fb788: [<7f5d6e9e2b70>] warn_slowpath_fmt+0x90/0xa0
[ 11.800000] 00007f5d5a7fb818: [<7f5d6ecb676a>] __kfree_skb+0x9a/0xb0
[ 11.800000] 00007f5d5a7fb848: [<7f5d6ecb6868>] consume_skb+0x28/0x70
[ 11.800000] 00007f5d5a7fb868: [<7f5d6ec6949c>] vm_notify+0x6c/0x70
[ 11.800000] 00007f5d5a7fb898: [<7f5d6ec67de5>] virtqueue_notify+0x15/0x40
[ 11.800000] 00007f5d5a7fb8b8: [<7f5d6ec67e32>] virtqueue_kick+0x22/0x30
[ 11.800000] 00007f5d5a7fb8d8: [<7f5d6ec89e49>] start_xmit+0x2d9/0x520
[ 11.800000] 00007f5d5a7fb938: [<7f5d6ecc7205>] dev_hard_start_xmit+0x245/0x350
[ 11.800000] 00007f5d5a7fb9b8: [<7f5d6ece5d4b>] sch_direct_xmit+0xbb/0x1a0
[ 11.800000] 00007f5d5a7fb9f8: [<7f5d6ecc7617>] __dev_queue_xmit+0x1c7/0x4a0
[ 11.800000] 00007f5d5a7fba58: [<7f5d6ecc78fb>] dev_queue_xmit+0xb/0x10
[ 11.800000] 00007f5d5a7fba68: [<7f5d6ecf57d1>] ip_finish_output2+0x1a1/0x380
[ 11.800000] 00007f5d5a7fbac8: [<7f5d6ecf77fe>] ip_finish_output+0x16e/0x210
[ 11.800000] 00007f5d5a7fbad8: [<7f5d6ec66bdd>] atomic64_add+0x2d/0x40
[ 11.800000] 00007f5d5a7fbb18: [<7f5d6ecf7a79>] ip_output+0x89/0xa0
[ 11.800000] 00007f5d5a7fbb68: [<7f5d6ecf6824>] ip_local_out+0x34/0x40
[ 11.800000] 00007f5d5a7fbb98: [<7f5d6ecf6b85>] ip_queue_xmit+0x165/0x3d0
[ 11.800000] 00007f5d5a7fbbe8: [<7f5d6ed0f2d3>] tcp_transmit_skb+0x4c3/0xa20
[ 11.800000] 00007f5d5a7fbc58: [<7f5d6ed11a64>] tcp_send_active_reset+0xc4/0x160
[ 11.800000] 00007f5d5a7fbc88: [<7f5d6ed02260>] tcp_close+0x3f0/0x480
[ 11.800000] 00007f5d5a7fbcb8: [<7f5d6ed2b477>] inet_release+0x37/0x60
[ 11.800000] 00007f5d5a7fbcd8: [<7f5d6ecae676>] sock_release+0x16/0x90
[ 11.800000] 00007f5d5a7fbcf8: [<7f5d6ecae6fd>] sock_close+0xd/0x20
[ 11.800000] 00007f5d5a7fbd08: [<7f5d6ea46946>] __fput+0x56/0x150
[ 11.800000] 00007f5d5a7fbd48: [<7f5d6ea46a8b>] delayed_fput+0x4b/0x60
[ 11.800000] 00007f5d5a7fbd68: [<7f5d6e9f3694>] process_one_work+0x154/0x380
[ 11.800000] 00007f5d5a7fbdc8: [<7f5d6e9f3a1e>] worker_thread+0x15e/0x500
[ 11.800000] 00007f5d5a7fbe38: [<7f5d6e9f8d3d>] kthread+0x10d/0x130
[ 11.800000] 00007f5d5a7fbe40: [<7f5d6e9f38c0>] worker_thread+0x0/0x500
[ 11.800000] 00007f5d5a7fbea0: [<7f5d6e9f8c30>] kthread+0x0/0x130
[ 11.800000] 00007f5d5a7fbec0: [<7f5d6e9f8c30>] kthread+0x0/0x130
[ 11.800000] 00007f5d5a7fbed8: [<7f5d6e9df316>] thread_bootstrap+0x56/0x60
[ 11.800000]
[ 11.800000] ---[ end trace 508e97b5072f224d ]---

I think so, I'll take a more indepth look tomorrow.

tools/lkl/lib/virtio_net_fd.c, line 115 [r3] (raw file):

Previously, liuyuan10 (Yuan Liu) wrote…

Should we inititialize events to 0?

Already initialized to 0 (missing field in struct intializer always initializes to 0).

tools/lkl/lib/virtio_net_fd.c, line 178 [r3] (raw file):

Previously, liuyuan10 (Yuan Liu) wrote…

it's better to use host_ops->mem_alloc? because you are using host_ops->free when closing it.

Good catch. This part has not been changed from the original implementation, so I think its best if we add a new commit on top to fix it.

Comments from Reviewable

@ghost ghost mentioned this pull request Aug 16, 2016
@ghost
Copy link
Author

ghost commented Aug 16, 2016

Review status: 3 of 16 files reviewed at latest revision, 6 unresolved discussions.


tools/lkl/lib/virtio_net.c, line 267 [r3] (raw file):

Previously, opurdila (Octavian Purdila) wrote…

I think so, I'll take a more indepth look tomorrow.

Should be fixed in r5, please take a look.

Comments from Reviewable

Octavian Purdila added 10 commits August 16, 2016 15:03
struct lkl_netdev has been changed to include an additional field and
thus the current casts in VDE don't work anymore. Fix this by
including struct lkl_netdev into struct lkl_netdev_vde and using
container_of instead of plain casts.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Collapse the rx and tx poll threads into a single poll thread. I did not
notice any performance degradation and reducing the number of threads
reduces the complexity.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
This allows us to identify poll errors and make it a bit more explicit
when the poll thread needs to exit. It also allows us to move the poll
thread join to virtio_net.c to simplify implementation for other
backends. It also makes the thread create and join operations are
symmetric as both are now happening in virtio_net.c

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Add a new form on interceptors that issue host system calls if LKL has
not yet started. This is to be used for those system calls that are not
easy to determine if they belong to LKL or not (such as pipe or
eventfd).

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Rework the virtio net Linux fd implementation to make it POSIX
compatible. We do this by removing dependencies on epoll and eventfd.

Instead of using eventfd to shutdown the poll thread we use a control
pipe - closing the write side of the pipe will issue a POLLHUP and
wake-up the poll thread.

Like-wise, instead of using an epoll edge-triggered approach to
mitigate the high CPU usage on the TX side, we use a level-triggered
poll approach where the poll POLLIN / POLLOUT mask is set only when is
needed, i.e. when the read or write calls returns EAGAIN. The control
pipe is used to wake-up the poll thread and change the poll mask to
avoid race between the poll thread and LKL threads issuing reads or
writes.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
There is no need to duplicate the net device operations in both
virtio_net_dev and lkl_netdev.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Now that we have a reliable way of shutting down the virtio net poll
thread we can (a) convert the close operation "return" void and (b)
remove the lkl_netdevs_remove() call from lkl_sys_halt.

This patch also makes lkl_netdev_remove() available to be called by the
application before lkl_sys_halt. This makes the lkl_netdev_add/remove
operations symmetric and easier to use by applications.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
@cemeyer
Copy link

cemeyer commented Aug 16, 2016

Reviewed 1 of 13 files at r1, 3 of 5 files at r3, 9 of 10 files at r4, 2 of 2 files at r5.
Review status: all files reviewed at latest revision, 11 unresolved discussions.


a discussion (no related file):
This is an improvement from a FreeBSD perspective, thanks! Some work still needs to be done to adapt the tun/tap logic to FreeBSD. We have tun/tap in some form, but I'm not familiar with it. Also, the ioctls are slightly different.


tools/lkl/lib/virtio_net_fd.c, line 48 [r5] (raw file):

 * BUILD_BUG_ON() were supported in LKL, I would have added
 *
 * "BUILD_BUG_ON(sizeof(struct lkl_dev_buf) == sizeof(struct iovec));"

Why not just use iovec? It is part of POSIX 2001.


tools/lkl/lib/virtio_net_fd.c, line 137 [r5] (raw file):

          return LKL_DEV_NET_POLL_HUP;
      if (ret < 0)
          perror("virtio net fd pipe read");

Shouldn't we really return HUP in the ret > 0 case (i.e. we received the ping over the pipefd)? The other cases imply poll implementation errors or some other implementation error.


tools/lkl/lib/virtio_net_macvtap.c, line 19 [r5] (raw file):

#include <net/if.h>
#include <linux/if_tun.h>

This header isn't in POSIX, for example :-).


tools/lkl/lib/hijack/hijack.c, line 158 [r5] (raw file):

HOOK_FD_CALL(recv)
HOOK_FD_CALL(epoll_wait)
HOOK_CALL_USE_HOST_BEFORE_START(pipe);

pipe is no longer a syscall on FreeBSD11+, so invoking it directly with lkl_syscall() in the HOOK_CALL_USE_HOST_BEFORE_START macro will fail.


Comments from Reviewable

@ghost
Copy link
Author

ghost commented Aug 16, 2016

Review status: all files reviewed at latest revision, 12 unresolved discussions.


a discussion (no related file):

Previously, cemeyer (Conrad Meyer) wrote…

This is an improvement from a FreeBSD perspective, thanks! Some work still needs to be done to adapt the tun/tap logic to FreeBSD. We have tun/tap in some form, but I'm not familiar with it. Also, the ioctls are slightly different.

Thanks for taking a look.

tools/lkl/lib/virtio_net_fd.c, line 48 [r5] (raw file):

Previously, cemeyer (Conrad Meyer) wrote…

Why not just use iovec? It is part of POSIX 2001.

To allow win32 support, eventually. Not for this file, but as the interface between the generic virtio_net.c and the backends line virtio_net_fd.c

tools/lkl/lib/virtio_net_fd.c, line 137 [r5] (raw file):

Previously, cemeyer (Conrad Meyer) wrote…

Shouldn't we really return HUP in the ret > 0 case (i.e. we received the ping over the pipefd)? The other cases imply poll implementation errors or some other implementation error.

We ping the pipe when poll_rx / poll_tx is set. The signal to exit the poll thread is closing the pipe.

tools/lkl/lib/virtio_net_macvtap.c, line 19 [r5] (raw file):

Previously, cemeyer (Conrad Meyer) wrote…

This header isn't in POSIX, for example :-).

The different backends line tap/macvtap/raw are definitely not posix, but the common virtio_net_fd.c should be.

tools/lkl/lib/virtio_net_macvtap.c, line 19 [r5] (raw file):

 
}
 
/* Return 0 for success, -1 for failure. */
void lkl_netdev_remove(int id)
{
struct virtio_net_dev *dev;
int ret;
 
if (id >= registered_dev_idx) {
lkl_printf("%s: invalid id: %d\n", func, id);
return;
}
 
dev = registered_devs[id];
 
ret = lkl_netdev_get_ifindex(id);
if (ret < 0) {
lkl_printf("%s: failed to get ifindex for id %d: %s\n",
func, id, lkl_strerror(ret));
 

  • sruct iovec so we can safely cast iov to (struct iovec *). (If
  • BUILD_BUG_ON() were supported in LKL, I would have added
    *
  • "BUILD_BUG_ON(sizeof(struct lkl_dev_buf) == sizeof(struct iovec));"
    _/
    static int fd_net_tx(struct lkl_netdev *nd, struct lkl_dev_buf *iov, int cnt)
    {
    static int fd_net_rx(struct lkl_netdev *nd, struct lkl_dev_buf *iov, int cnt)
    } else {
    char tmp;
     
    nd_fd->poll_rx = 1;
    if (write(nd_fd->pipe[1], &tmp, 1) < 0)
    perror("virtio net fd pipe write");
    }
    }
    return ret;
    }
     
    static int fd_net_poll(struct lkl_netdev *nd)
    {
    struct lkl_netdev_fd *nd_fd =
    container_of(nd, struct lkl_netdev_fd, dev);
    struct pollfd pfds[2] = {
    {
    .fd = nd_fd->fd,
    },
    {
    .fd = nd_fd->pipe[0],
    .events = POLLIN,
    },
    };
    int ret;
     
    if (nd_fd->poll_rx)
    pfds[0].events |= POLLIN|POLLPRI;
    if (nd_fd->poll_tx)
    pfds[0].events |= POLLOUT;
     
    if (ret == 0)
    return LKL_DEV_NET_POLL_HUP;
    if (ret < 0)
    perror("virtio net fd pipe read");
    }
     
    ret = 0;
    struct lkl_netdev *lkl_register_netdev_fd(int fd)
    {
    struct lkl_netdev_fd *nd;
     
    nd = malloc(sizeof(_nd));
    if (!nd) {
    fprintf(stderr, "fdnet: failed to allocate memory\n");
    /* TODO: propagate the error state, maybe use errno for that? */
     
    */
     
    #include <net/if.h>
    #include <linux/if_tun.h>

tools/lkl/lib/hijack/hijack.c, line 158 [r5] (raw file):

Previously, cemeyer (Conrad Meyer) wrote…

pipe is no longer a syscall on FreeBSD11+, so invoking it directly with lkl_syscall() in the HOOK_CALL_USE_HOST_BEFORE_START macro will fail.

Hmm, that sounds strange. It doesn't actually need to be a system call, we just need a symbol named pipe in one of the libraries for it to work.

Comments from Reviewable

@cemeyer
Copy link

cemeyer commented Aug 16, 2016

Review status: all files reviewed at latest revision, 11 unresolved discussions.


tools/lkl/lib/virtio_net_fd.c, line 48 [r5] (raw file):

Previously, opurdila (Octavian Purdila) wrote…

To allow win32 support, eventually. Not for this file, but as the interface between the generic virtio_net.c and the backends line virtio_net_fd.c

Still — why not use `iovec` generally, and just define `iovec` on win32 if it is missing there? Or is the intent to pass different kinds of arrays to different backends?

tools/lkl/lib/virtio_net_fd.c, line 137 [r5] (raw file):

Previously, opurdila (Octavian Purdila) wrote…

We ping the pipe when poll_rx / poll_tx is set. The signal to exit the poll thread is closing the pipe.

Ok.

tools/lkl/lib/hijack/hijack.c, line 158 [r5] (raw file):

Previously, opurdila (Octavian Purdila) wrote…

Hmm, that sounds strange. It doesn't actually need to be a system call, we just need a symbol named pipe in one of the libraries for it to work.

There is a `pipe` symbol in libc now. It's just a thin shim around `pipe2`.

Maybe I am misreading it, but it looks like the macro invokes syscall(__NR_pipe)? That won't work. (FreeBSD's pipe syscall returned arguments in some crazy way previously, anyway.) But perhaps this only matters to users of the hijack functionality.


Comments from Reviewable

@ghost
Copy link
Author

ghost commented Aug 16, 2016

Review status: all files reviewed at latest revision, 10 unresolved discussions.


tools/lkl/lib/virtio_net_fd.c, line 48 [r5] (raw file):

Previously, cemeyer (Conrad Meyer) wrote…

Still — why not use iovec generally, and just define iovec on win32 if it is missing there? Or is the intent to pass different kinds of arrays to different backends?

No, probably defining iovec for windows would work. Now that I think about it this is mainly a historical reason - this used to be the interface between the kernel and the host and since kernel iovec and host iovec could be different we introduced this new type. Now that we moved to virtio this should not be a problem anymore.

tools/lkl/lib/hijack/hijack.c, line 158 [r5] (raw file):

Previously, cemeyer (Conrad Meyer) wrote…

There is a pipe symbol in libc now. It's just a thin shim around pipe2.

Maybe I am misreading it, but it looks like the macro invokes syscall(__NR_pipe)? That won't work. (FreeBSD's pipe syscall returned arguments in some crazy way previously, anyway.) But perhaps this only matters to users of the hijack functionality.

If we have the symbol then we should be fine. The macro will use dlsym(RTLD_NEXT) to find the pipe symbol and will call it if lkl has not started yet. If lkl started it will invoke LKL's pipe.

Comments from Reviewable

@cemeyer
Copy link

cemeyer commented Aug 16, 2016

Review status: all files reviewed at latest revision, 10 unresolved discussions.


tools/lkl/lib/hijack/hijack.c, line 158 [r5] (raw file):

Previously, opurdila (Octavian Purdila) wrote…

If we have the symbol then we should be fine. The macro will use dlsym(RTLD_NEXT) to find the pipe symbol and will call it if lkl has not started yet. If lkl started it will invoke LKL's pipe.

Doh! Sorry, I was misreading it. Ignore me :-).

Comments from Reviewable

@liuyuan10
Copy link
Member

:lgtm:

I didn't see any difference in performance from my testing.


Review status: all files reviewed at latest revision, 10 unresolved discussions.


Comments from Reviewable

@ghost ghost merged commit e145a5a into lkl:master Aug 16, 2016
@ghost ghost deleted the lkl-virtio-net-fd branch August 16, 2016 21:09
@cemeyer
Copy link

cemeyer commented Aug 17, 2016

@opurdila, After this change, virtio_net_fd.c compiles fine on FreeBSD. Thanks!

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants