OVS Orbit Podcast

Interviews and topics of interest to Open vSwitch developers and users, hosted by Ben Pfaff.

RSS Feed

Use the RSS feed to keep up with new episodes in your favorite podcast listening application.

Episodes

20. Protocol-Independent FIB Architecture, with Ryo Nakamura from University of Tokyo (Nov 29) MP3
19. The Faucet SDN Controller, with Josh Bailey from Google and Shivaram Mysore from ONF (Nov 13) MP3
18. OVN Launch, with Russell Bryant from Red Hat (Oct 28) MP3
17. Debugging OpenStack Problems using a State Graph Approach, with Yong Xiang from Tsinghua University (Oct 13) MP3
16. Tunneling and Encapsulation, with Jesse Gross from VMware (Sep 26) MP3
15. Lagopus, with Yoshihiro Nakajima from NTT (Sep 10) MP3
14. Converging Approaches to Software Switches (Aug 28) MP3
13. Time Capsule, with Jia Rao and Kun Suo from University of Texas at Arlington (Aug 20) MP3
12. Open vSwitch Joins Linux Foundation (Aug 10) MP3
11. P4 on the Edge, with John Fastabend from Intel (Aug 9) MP3
★★ 10. SoftFlow, with Ethan Jackson from Berkeley (Jul 20) MP3
★★ 9. Adding P4 to OVS with PISCES, with Muhammad Shahbaz from Princeton (Jun 25) MP3
8. Mininet, with Bob Lantz and Brian O'Connor from ON.LAB (Jun 18) MP3
7. The OVS Development Process, with Kyle Mestery from IBM (Jun 11) MP3
6. sFlow, with Peter Phaal from InMon (Jun 2) MP3
★★★ 5. nlog, with Teemu Koponen from Styra and Yusheng Wang from VMware (May 26) MP3
★★★★ 4. Cilium, with Thomas Graf from Cisco (May 21) MP3
3. OVS in Production, with Chad Norgan from Rackspace (May 8) MP3
★★★★ 2. OPNFV and OVS, with Dave Neary from Red Hat (May 4) MP3
★★★ 1. Porting OVS to Hyper-V, with Alessandro Pilotti from Cloudbase (May 1) MP3

Episode 20: Protocol-Independent FIB Architecture, with Ryo Nakamura from University of Tokyo (Nov 29, 2016)

Ryo Nakamura is a PhD student at The University of Tokyo, studying IP networking, overlay networking, and network operation. This episode is a recording I made of his talk during APSys 2016, the Asia-Pacific Workshop on Systems, on Aug. 5, based on Protocol-Independent FIB Architecture for Network Overlays, written with co-authors Yohei Kuga (from Keio University), Yuji Sekiya, and Hiroshi Esaki.

The abstract for this paper says:

We introduce a new forwarding information base architecture into the stacked layering model for network overlays. In recent data center networks, network overlay built upon tunneling protocols becomes an essential technology for virtualized environments. However, the tunneling stacks network layers twice in the host OS, so that processing to transmit packets increases and throughput will degrade. First, this paper shows the measurement result of the degradation on a Linux kernel, in which throughputs in 5 tunneling protocols degrade by over 30%. Then, we describe the proposed architecture that enables the shortcut for the second protocol processing for network overlays. In the evaluation with a dummy interface and a modified Intel 10-Gbps NIC driver, transmitting throughput is improved in 5 tunneling protocols and the throughput of the Linux kernel is approximately doubled in particular protocols.

Before the talk, session chair Florin Dinu introduces the speaker. Following the talk, the questions come from Ben Pfaff, Sorav Bansal, and Florin, respectively. Sorav's question refers to my own talk from earlier the same day at the conference, which is published as OVS Orbit Episode 14.

OVS Orbit is produced by Ben Pfaff. The intro music in this episode is Drive, featuring cdk and DarrylJ, copyright 2013, 2016 by Alex. The bumper music is Yeah Ant featuring Wired Ant and Javolenus, copyright 2013 by Speck. The outro music is Space Bazooka featuring Doxen Zsigmond, copyright 2013 by Kirkoid. All content is licensed under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) license.

Listen: MP3 (22 MB, 24 min).

Episode 19: The Faucet SDN Controller, with Josh Bailey from Google and Shivaram Mysore from ONF (Nov 13, 2016)

Faucet is an open source SDN controller developed by a community that includes engineers at Google's New Zealand office, the Open Networking Foundation (ONF), and others. This episode is an interview with Josh Bailey from Google and Shivaram Mysore from the ONF. It was recorded on Nov. 7, at Open vSwitch 2016 Fall Conference.

The episode begins with a description of Faucet's goals. Unlike the higher-profile OpenDaylight and ONOS controllers, which focus on performance at high scale, Faucet treats simplicity, ease of development, and small code size as higher priorities.

Also in contrast to most controllers, Faucet does not contain code specific to individual vendors or models of OpenFlow switch. Rather, it targets any OpenFlow 1.3 switch that fulfills its minimum multi-table and other requirements, using a pipeline of tables designed to be suitable for many purposes. In Josh's words, “The most important one was tables. Once you have tables, you can say ‘if-then’. If you don't have tables, you can only go ‘if-and-and-and-and’.”

Faucet development has focused on deployments. Several Faucet users have come forward to publicly talk about their use, with the highest profile of those being the Open Networking Foundation deployment at their own offices. See also a map of public deployments. Shiva describes a temporary deployment at the ONF Member Workdays for conference wi-fi use.

Performance is not a focus for Faucet. Instead, the developers encourage users to experiment with deployments and find out whether there is an actual performance problem in practice. Shivaram reports that this has worked out well.

Faucet can control even very low-end switches, such as the Zodiac, a 4-port switch from Northbound Networks that costs AUD $99 (about USD $75). Faucet itself has low memory and CPU requirements, which means that it can run on low-end hardware such as a Raspberry Pi (about $30), which has actually been deployed as a production controller for enterprise use.

Last summer, the ONF hosted a Faucet hackfest in Bangalore, where each team was supplied its own “Pizod,” a combination of a Zodiac and a Raspberry Pi, for development. Hackers at the hackfest were required to have Python experience, but not networking or OpenFlow experience. Each team of 4, which included a documentation person and a UX person, chose a project from an assigned list of possibilities.

Faucet records the state of the system, over time, to an InfluxDB database and exposes that for inspection through a Grafana dashboard.

The Faucet code is small, about 2,500 lines of code. About this size, Josh says, “I'd be surprised if it gets about four times the size, because we've got quite a clear idea of its scope... Think of Faucet as your autonomic nervous system, a small important part of your brain but it keeps you breathing and it reacts to high-priority threats before your conscious mind sets in. You keep that code small and you test the heck out of it.”

Josh is working on extending support for distributed switching within Faucet. Troubleshooting large L2 fabrics is especially frustrating, and Josh aims to make it easier. Shiva is encouraging deployments, especially feedback from deployments, and control over wi-fi. Other priorities are better dashboards and better IPv6 support.

For more information on Faucet, visit the Faucet blog, read the ACM Queue article on Faucet, dive into the Faucet GitHub repo, or search for “Faucet SDN” on YouTube.

OVS Orbit is produced by Ben Pfaff. The intro music in this episode is Drive, featuring cdk and DarrylJ, copyright 2013, 2016 by Alex. The bumper music is Yeah Ant featuring Wired Ant and Javolenus, copyright 2013 by Speck. The outro music is Space Bazooka featuring Doxen Zsigmond, copyright 2013 by Kirkoid. All content is licensed under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) license.

Listen: MP3 (43 MB, 47 min).

Episode 18: OVN Launch, with Russell Bryant from Red Hat (Oct 28, 2016)

OVN is a network virtualization system that has been under development as part of the Open vSwitch project for about the last two years. On this podcast, Ben Pfaff and Russell Bryant, two major contributors to OVN, describe OVN: its architecture and its features, focusing on those added in the recent release of Open vSwitch 2.6, as well as some future directions.

This episode is based on the material presented at the OpenStack Summit in the session titled “OVN - Moving to Production.” The summit talk was recorded and video and slides are available. This podcast follows the structure of the slides pretty closely, making them a useful companion document to look at while listening, but the episode is meant to stand alone.

Resources mentioned in this episode:

Russell Bryant is a software developer in Red Hat's Office of the CTO. You can find him at russellbryant.net or on Twitter as @russellbryant.

OVS Orbit is produced by Ben Pfaff. The intro music in this episode is Drive, featuring cdk and DarrylJ, copyright 2013, 2016 by Alex. The bumper music is Yeah Ant featuring Wired Ant and Javolenus, copyright 2013 by Speck. The outro music is Space Bazooka featuring Doxen Zsigmond, copyright 2013 by Kirkoid. All content is licensed under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) license.

Listen: MP3 (32 MB, 35 min).

Episode 17: Debugging OpenStack Problems using a State Graph Approach, with Yong Xiang from Tsinghua University (Oct 13, 2016)

Yong Xiang is an associate professor at Tsinghua University. This episode is a recording of his talk during APSys 2016, the Asia-Pacific Workshop on Systems, on Aug. 5, based on Debugging OpenStack Problems Using a State Graph Approach, written with co-authors Hu Li, Sen Wang, Charley Peter Chen, and Wei Xu, which was awarded “best paper” at the conference. A preprint of the paper is also available at arxiv.org.

Slides from the talk are available. The talk is probably easier to follow with the slides at hand, but they are certainly not necessary.

This is a very practical paper that seeks ways to make it easier for non-experts to troubleshoot and debug an OpenStack deployment. Its abstract is:

It is hard to operate and debug systems like OpenStack that integrate many independently developed modules with multiple levels of abstractions. A major challenge is to navigate through the complex dependencies and relationships of the states in different modules or subsystems, to ensure the correctness and consistency of these states. We present a system that captures the runtime states and events from the entire OpenStack-Ceph stack, and automatically organizes these data into a graph that we call system operation state graph (SOSG). With SOSG we can use intuitive graph traversal techniques to solve problems like reasoning about the state of a virtual machine. Also, using a graph-based anomaly detection, we can automatically discover hidden problems in OpenStack. We have a scalable implementation of SOSG, and evaluate the approach on a 125-node production OpenStack cluster, finding a number of interesting problems.

The first question at the end of the talk comes from me, with an answer assisted by the paper's coauthor Wei Xu, and the second one from Sorav Bansal.

OVS Orbit is produced by Ben Pfaff. The intro music in this episode is Drive, featuring cdk and DarrylJ, copyright 2013, 2016 by Alex. The bumper music is Yeah Ant featuring Wired Ant and Javolenus, copyright 2013 by Speck. The outro music is Space Bazooka featuring Doxen Zsigmond, copyright 2013 by Kirkoid. All content is licensed under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) license.

Listen: MP3 (18 MB, 20 min).

Episode 16: Tunneling and Encapsulation, with Jesse Gross from VMware (Sep 26, 2016)

Tunneling and encapsulation, with protocols from GRE to Geneve, have been a key part of Open vSwitch since the early days. Jesse Gross, an early employee at Nicira and major contributor to Open vSwitch, and perhaps most importantly the maintainer of the Open vSwitch kernel module, joins this episode of the podcast to talk about this aspect of OVS.

The conversation begins with a discussion of the reasons for L2-in-L3 tunnels. Jesse's reasons for such tunnels include adding a layer of indirection between physical and virtual networks. VLANs can provide a solution for partitioning networks, but they don't provide the same layer of indirection.

Jesse describes the motivation for designing and implementing STT encapsulation in Open vSwitch. The biggest reason was performance, primarily the cost of losing the network card hardware support for various kinds of offloads, such as checksum and TCP segmentation offload support. Most network cards can only implement these for specific protocols, so that using an encapsulation not specifically supported by the card caused performance degradation. STT worked around this by using (abusing?) TCP as an encapsulation. Since most network cards can offload TCP processing, this allowed STT encapsulation to be very fast on both the send and receive sides. Jesse also describes the challenges in implementing STT and his view of STT's future.

Whereas STT was designed as a performance hack for existing network cards, Geneve, the second encapsulation that Jesse designed and implemented in Open vSwitch, addresses the paucity of metadata that the existing tunneling protocols supported. GRE, for example, supports a 32-bit key, VXLAN supports a 24-bit VNI, STT a 64-bit key, and so on. None of them supports a large or flexible amount of metadata. Geneve, on the other hand, supports an almost arbitrary number of type-length-value (TLV) options, intended to be future-proof. Geneve has been working its way through the IETF for about 3 1/2 years.
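To make the metadata comparison concrete, here is a rough Python sketch of the Geneve base header layout as described in the IETF drafts; the field values and the option bytes are illustrative, not taken from the episode:

    import struct

    def geneve_header(vni, options=b"", protocol=0x6558):
        # Geneve base header: Ver(2) OptLen(6) | O C Rsvd(6) | Protocol(16),
        # then VNI(24) + Reserved(8). OptLen counts options in 4-byte words,
        # so the TLV option space can grow almost arbitrarily.
        # Protocol 0x6558 is Transparent Ethernet Bridging (L2-in-L3).
        assert len(options) % 4 == 0
        ver_optlen = (0 << 6) | (len(options) // 4)   # version 0
        flags = 0                                     # O and C bits clear
        return (struct.pack("!BBH", ver_optlen, flags, protocol)
                + struct.pack("!I", vni << 8)         # 24-bit VNI, 8 reserved bits
                + options)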

Jesse talks about NSH (Network Service Header), which is often mentioned in conjunction with Geneve. NSH has some specialization for service function chaining, whereas Geneve takes a more general-purpose stance. NSH does support TLVs, but its primary focus is on a fixed number of fixed-size headers that it keeps in the packet, and those fixed headers are what most implementations actually support. NSH can be used inside L2 or L3, whereas Geneve currently runs only inside L3. Jesse discusses pros and cons of each design.

Jesse discusses MTU issues in tunneling and encapsulation, which come up because these techniques add bytes to each packet, so that a packet of maximum length before encapsulation exceeds the MTU afterward. Jesse says that the solution to MTU problems depends on the use case: for example, in data center use cases, a simple solution can be to increase the MTU of the physical network. In the pre-1.10 era, Open vSwitch supported path MTU discovery for tunnels, and Jesse describes why it was dropped and what it would take to reintroduce it.
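A quick worked example, using VXLAN-over-IPv4 with its usual header sizes (my own illustration, not figures from the episode), shows why a maximum-length packet breaks after encapsulation:

    # An inner Ethernet frame carrying a full 1500-byte IP packet, once
    # wrapped in VXLAN over IPv4, produces an outer IP packet that no
    # longer fits a standard 1500-byte MTU.
    inner_ip = 1500   # inner IP packet at the physical-network MTU
    inner_eth = 14    # inner Ethernet header carried inside the tunnel
    vxlan = 8
    outer_udp = 8
    outer_ip = 20
    outer_packet = inner_ip + inner_eth + vxlan + outer_udp + outer_ip
    print(outer_packet)   # 1550: 50 bytes too large for a 1500-byte MTU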

Jesse describes CAPWAP tunneling, why OVS implemented it, and why OVS dropped support.

Jesse describes GTP tunneling and the potential for including it in OVS, as well as ERSPAN encapsulation.

Jesse describes the challenges of encapsulations for IP (as opposed to encapsulations for Ethernet).

Jesse lays out some thoughts on the future of tunneling in Open vSwitch.

OVS Orbit is produced by Ben Pfaff. The intro music in this episode is Drive, featuring cdk and DarrylJ, copyright 2013, 2016 by Alex. The bumper music is Yeah Ant featuring Wired Ant and Javolenus, copyright 2013 by Speck. The outro music is Space Bazooka featuring Doxen Zsigmond, copyright 2013 by Kirkoid. All content is licensed under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) license.

Listen: MP3 (31 MB, 34 min).

Episode 15: Lagopus, with Yoshihiro Nakajima from NTT (Sep 10, 2016)

Lagopus is a high-performance, open source software switch, primarily for DPDK on Linux, developed at NTT in its Network Innovation Laboratories research group. Lagopus features OpenFlow 1.3 conformance, plus extensions to better support NTT's use cases. This episode is a discussion with Yoshihiro Nakajima, one of the switch's developers, about Lagopus, its history, goals, and future.

Lagopus supports protocols that are particularly important to carriers, such as PBB and MPLS, and includes OpenFlow extensions for general-purpose tunnel support with VXLAN, GRE, and other encapsulations. Yoshihiro talks about how, with DPDK, Lagopus implements some protocols, such as ARP and ICMP, by delegating them to the Linux kernel through TAP devices.

Yoshihiro describes the architecture of Lagopus and how it achieves high performance. It has optimizations specific to the flows that each table is expected to contain; for example, a different lookup implementation for L2 and L3 tables. We talk about how the number of tables in a given application affects performance.

Lagopus targets two main application domains: high-performance switching or routing on bare-metal servers, and high-performance virtual switching for NFV. Some of the latter applications are in a testing phase, aiming for eventual production deployment.

We discuss some philosophy of SDN (some audio was lost at the beginning of this discussion). The important part of SDN, to Yoshihiro, is to avoid the need to use CLIs to configure switches, instead moving to a “service-defined” model.

We discuss how to fit stateful services into the stateless OpenFlow match-and-action pipeline model, particularly how to handle the need for sequence numbers in some tunneling protocols such as GRE and GTP.

We talk about the difficulties in forming an open source community around a software switch and attracting contributions from a group outside the immediate organization writing the software. Yoshihiro reports receiving feedback from several users, including suggestions for improvement.

Lagopus has a growing worldwide community but some of the outreach from the team has focused on Asia in general and Japan in particular because of lower geographical and communication barriers.

The Lagopus team is currently working on a switch and routing control API that works at a higher level than OpenFlow, based on YANG models.

OVS Orbit is produced by Ben Pfaff. The intro music in this episode is Drive, featuring cdk and DarrylJ, copyright 2013, 2016 by Alex. The bumper music is Yeah Ant featuring Wired Ant and Javolenus, copyright 2013 by Speck. The outro music is Space Bazooka featuring Doxen Zsigmond, copyright 2013 by Kirkoid. All content is licensed under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) license.

Listen: MP3 (24 MB, 26 min).

Episode 14: Converging Approaches to Software Switches (Aug 28, 2016)

On Aug. 4 and 5, I attended APSys 2016, the Asia-Pacific Workshop on Systems. This episode is my own “industry talk” from APSys, titled “Converging Approaches in Software Switches: Combining code- and data-driven approaches to achieve a better result.” Slides from the talk are available and may provide a little extra insight, but it is not necessary to view them to follow along with the talk.

This talk introduces the idea that software switches can be broadly divided, in terms of their architecture, into two categories: “code-driven” switches that call a series of arbitrary functions on each packet, and “data-driven” switches that use a single engine to apply actions selected from a series of tables. The talk explains the two models and the usefulness of the categorization, and explains how hybrids of the two models can build on the strengths of both.
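As a toy illustration of the two categories (the function and table names here are invented, not from the talk), the contrast looks something like this:

    # Code-driven: each packet runs through a chain of arbitrary functions.
    def code_driven(packet, stages):
        for stage in stages:          # e.g. [parse, firewall, nat, forward]
            packet = stage(packet)
            if packet is None:        # a stage may drop the packet
                return None
        return packet

    # Data-driven: a single engine walks a series of match/action tables.
    def data_driven(packet, tables):
        for table in tables:
            for action in table.lookup(packet):   # classifier picks actions
                packet = action(packet)
        return packet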

In the past, people have asked me to compare Open vSwitch to other software switches, both architecture- and performance-wise. This talk is the closest that I plan to come to a direct comparison. In it, I cover a key architectural difference between Open vSwitch and most other software switches, and I explain why that architectural difference makes a difference for benchmarks that authors of many software switches like to tout.

This talk includes a very kind introduction from Sorav Bansal, assistant professor at IIT-Delhi, as well as several questions and answers interleaved, including some from Sorav and some from others whose names I did not catch.

OVS Orbit is produced by Ben Pfaff. The intro music in this episode is Drive, featuring cdk and DarrylJ, copyright 2013, 2016 by Alex. The bumper music is Yeah Ant featuring Wired Ant and Javolenus, copyright 2013 by Speck. The outro music is Space Bazooka featuring Doxen Zsigmond, copyright 2013 by Kirkoid. All content is licensed under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) license.

Listen: MP3 (26 MB, 28 min).

Episode 13: Time Capsule, with Jia Rao and Kun Suo from University of Texas at Arlington (Aug 20, 2016)

On Aug. 4 and 5, I attended APSys 2016, the Asia-Pacific Workshop on Systems. I was impressed with how many of the papers presented there were relevant to Open vSwitch and virtualization in general. This episode is an interview with Jia Rao and Kun (Tony) Suo of the University of Texas at Arlington, to talk about their APSys paper, Time Capsule: Tracing Packet Latency across Different Layers in Virtualized Systems, which received the conference's Best Paper award.

The paper's abstract is:

Latency monitoring is important for improving user experience and guaranteeing quality-of-service (QoS). Virtualized systems, which have complex I/O stacks spanning multiple layers and often with unpredictable performance, present more challenges in monitoring packet latency and diagnosing performance abnormalities compared to traditional systems. Existing tools either trace network latency at a coarse granularity, or incur considerable overhead, or lack the ability to trace across different boundaries in virtualized environments. To address this issue, we propose Time Capsule (TC), an in-band profiler to trace packet level latency in virtualized systems with acceptable overhead. TC timestamps packets at predefined tracepoints and embeds the timing information into packet payloads. TC decomposes and attributes network latency to various layers in the virtualized network stack, which can help monitor network latency, identify bottlenecks, and locate performance problems.
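The core mechanism is easy to picture. Here is a minimal Python sketch of in-band timestamping in the spirit of the paper, with an invented record format (the paper's actual encoding may differ):

    import struct, time

    RECORD = "!Bd"   # hypothetical record: tracepoint id + timestamp

    def stamp(payload: bytes, tracepoint_id: int) -> bytes:
        # At each predefined tracepoint, append timing info to the payload.
        return payload + struct.pack(RECORD, tracepoint_id, time.time())

    def decompose(payload: bytes, n: int):
        # Strip the trailing n records and attribute latency to each layer
        # as the difference between consecutive tracepoint timestamps.
        size = struct.calcsize(RECORD)
        recs = [struct.unpack_from(RECORD, payload,
                                   len(payload) - size * (n - i))
                for i in range(n)]
        return [(b[0], b[1] - a[1]) for a, b in zip(recs, recs[1:])]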

The interview covers the basic idea behind Time Capsule, the mechanism that it uses, techniques for comparing clocks of different machines across a network, and how it helps users and administrators track down latency issues in a virtual network, with reference to a specific example in the paper that shows the advantage of the fine-grained latency monitoring available in Time Capsule. “You can find some interesting results that are totally different from the results you get from coarse-grained monitoring.”

Other topics include comparison against whole-system profilers such as Perf or Xenoprof, the overhead of using Time Capsule, how many tracepoints are usually needed, how to decide where to put them, and how to insert a tracepoint.

There is a brief discussion of the relationship between Time Capsule and In-Band Network Telemetry (INT). Time Capsule focuses on virtualization, timing, and network processing within computer systems, whereas INT tends to focus more on switching and properties of the network such as queue lengths.

Time Capsule has not yet been released but it will be made available in the future. For now, the best way to learn more is to read the paper. Readers who want to know more can contact the authors at the email addresses listed in the paper.

The authors are using Time Capsule as the basis for continuing research into the performance of virtualized systems.

Time Capsule has some limitations. For example, it is limited to measurements of latency, and it cannot record packet drops. It also, currently, requires tracepoints to be inserted manually, although eBPF might be usable in the future.

OVS Orbit is produced by Ben Pfaff. The intro music in this episode is Drive, featuring cdk and DarrylJ, copyright 2013, 2016 by Alex. The bumper music is Yeah Ant featuring Wired Ant and Javolenus, copyright 2013 by Speck. The outro music is Space Bazooka featuring Doxen Zsigmond, copyright 2013 by Kirkoid. All content is licensed under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) license.

Listen: MP3 (28 MB, 31 min).

Episode 12: Open vSwitch Joins Linux Foundation (Aug 10, 2016)

On August 9, Open vSwitch joined the Linux Foundation as a Linux Foundation Collaborative Project, as previously discussed on ovs-discuss.

This episode is a recording of a conference call held by the Open vSwitch developers on August 10 to talk about this move, what will change and what will not change as a result, and open up for Q&A. Justin Pettit and Ben Pfaff are the main speakers in the call. You will also hear comments and questions from Simon Horman from Netronome and Mike Dolan from the Linux Foundation.

OVS Orbit is produced by Ben Pfaff. The intro music in this episode is Drive, featuring cdk and DarrylJ, copyright 2013, 2016 by Alex. The bumper music is Yeah Ant featuring Wired Ant and Javolenus, copyright 2013 by Speck. The outro music is Space Bazooka featuring Doxen Zsigmond, copyright 2013 by Kirkoid. All content is licensed under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) license.

Listen: MP3 (11 MB, 12 min).

Episode 11: P4 on the Edge, with John Fastabend from Intel (Aug 9, 2016)

Interview with John Fastabend, an engineer at Intel whose work in the Linux kernel has focused on the scheduling core of the networking stack and Intel NIC drivers. John has also been involved in IEEE standardization of 802.1Q and Data Center Bridging (DCB).

The interview focuses on John's recent work on P4 for edge devices, which he presented at the P4 Workshop held at Stanford in May. The slides for his talk are available.

John's work originated in the use of P4 as a language for describing the capabilities of Intel NICs, as an alternative to thousand-page manuals written in English. He moved on to explore ways that software can be offloaded into hardware, to improve performance and of course to make Intel's hardware more valuable. That led to the use of P4 to describe software as well, and eventually to the question that kicked off his talk, “Is P4 a useful abstraction for an edge node?” where an edge node in this case refers to a server running VMs or containers.

The work presented at the P4 conference includes a P4 compiler that generates IR code (that is, portable bytecode) for LLVM, a portable compiler that can generate code for many architectures and that is designed to be easily amenable to extensions. John then used an existing backend to LLVM to generate eBPF code that runs inside the Linux kernel on any architecture through an in-kernel just-in-time (JIT) compiler.

John used this infrastructure to answer a few different questions. Is P4 expressive enough to build a virtual switch? Does eBPF have enough infrastructure to implement a virtual switch? The answer in each case appears to be “yes.”

The runtime interface to the eBPF P4 programs works through eBPF maps. John's work includes tools for populating maps, including command-line and NETCONF interfaces.
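As a flavor of what a map-based runtime interface looks like (a generic bcc example I wrote for illustration, not John's tooling), userspace can create and populate an eBPF map directly:

    import ctypes
    from bcc import BPF   # requires the bcc toolkit

    # A toy eBPF array map standing in for a P4 table; the kernel side
    # would consult it per packet, while userspace populates it.
    b = BPF(text="BPF_ARRAY(port_table, u32, 256);")
    table = b["port_table"]
    table[ctypes.c_int(0)] = ctypes.c_uint32(42)   # write one table entry
    print(table[ctypes.c_int(0)].value)            # read it back: 42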

John is exploring the idea of using Intel AVX instructions to accelerate packet processing. He also points out that the JIT can actually be an asset for performance, rather than a liability, if it can specialize the code to better run on particular hardware. The well-established JITs for Java and Lua might point in the right direction.

John describes the performance advantages of XDP (Express Data Path), which processes packets that do not need to go to the operating system without constructing a full Linux sk_buff data structure.

The main application of this work, so far, has been to experiment with software implementations of hardware. John is also experimenting with a load balancer and a connection tracker.

John's work is all in the context of the Linux kernel. He speculates on how it could be applied to a switch running on DPDK in userspace. In such an environment, it might make sense to have LLVM compile directly to native code instead of via eBPF.

John talks about P4-specific optimizations to improve P4 programs that are written in a way that is difficult to implement efficiently in eBPF.

John and Ben discuss some of the differences between software and hardware implementations of P4.

John describes two models for network processing in software. In the “run-to-completion” model, a packet is processed from ingress to egress on a single core. In the “pipeline” model, the packet passes from one core to another at multiple stages in its processing. DPDK supports both models. John and Ben both have the intuition that the run-to-completion model is likely to be faster because it avoids the overhead of passing packets between cores, and they discuss why there might be exceptions.

The next steps are performance testing and optimization, gathering users, and moving to the 2016 revision of P4.

John and Ben discuss related work in P4 and eBPF. Thomas Graf's eBPF-based work on Cilium, discussed in Episode 4, leans more toward orchestration and scale over a large system than as a general-purpose switch. Ethan Jackson's work on SoftFlow, discussed in Episode 10, is more about how to integrate state with Open vSwitch. Muhammad Shahbaz's work on integrating P4 into Open vSwitch, discussed in Episode 9, can benefit from John's experience using LLVM.

If you're interested in experimenting with the prototype that John has developed, or if you have other questions for him, the best way to contact him is via email.

OVS Orbit is produced by Ben Pfaff. The intro music in this episode is Drive, featuring cdk and DarrylJ, copyright 2013, 2016 by Alex. The bumper music is Yeah Ant featuring Wired Ant and Javolenus, copyright 2013 by Speck. The outro music is Space Bazooka featuring Doxen Zsigmond, copyright 2013 by Kirkoid. All content is licensed under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) license.

Listen: MP3 (37 MB, 41 min).

Episode 10: SoftFlow, with Ethan Jackson from Berkeley (Jul 20, 2016)

Interview with Ethan Jackson, a PhD student at Berkeley advised by Scott Shenker. Before Berkeley, Ethan worked on Open vSwitch as an employee at Nicira Networks and then at VMware. His contributions to Open vSwitch have greatly slowed since he moved on to Berkeley, but as of this writing Ethan is still the second most prolific all-time contributor to Open vSwitch measured in terms of commits, with over 800.

Ethan talks about his experience implementing CFM and BFD protocols in Open vSwitch. He found out that, whenever anything went wrong in a network, the first thing that found the problem was CFM (or BFD), and so that was always reported as the root of the problem:

“Every bug in the company came directly to me, and I got very good at debugging and pointing out that other people's code was broken... That's really how I progressed as an engineer. Being forced to debug things makes you a better systems person.”

The body of the interview is about SoftFlow, a paper published at USENIX ATC about integrating middleboxes into Open vSwitch. The paper looks at the spectrum of ways to implement a software switch, which currently has two main points. At one end of the spectrum is the code-driven, Click-like model, where each packet passes through a series of black-box-like stages. At the other end is the data-driven Open vSwitch model, in which a single code engine applies a series of packet-classifier-based stages to a packet.

The data-driven model has some important advantages, especially regarding performance, but it's really bad at expressing middleboxes, particularly when state must be maintained between packets. SoftFlow is an attempt to bring Click-like functionality into an Open vSwitch world, where firewalls and NATs can be expressed and OpenFlow functionality can be incorporated where it is appropriate as well.

Part of the problem comes down to safety. It's not reasonable to trust all vendors to put code directly into the Open vSwitch address space, because of code quality and trust issues. The common solution, in an NFV environment, is to put each virtual network function into its own isolated virtual machine, but this has a high cost in performance and other resources.

SoftFlow is an extension to OpenFlow actions. Traditionally, actions are baked into the switch. SoftFlow allows a third party to augment actions in the switch via a well-defined interface. Actions are arbitrary code that can perform pretty much anything, but the intention is that they should integrate in well-defined ways with OpenFlow. For example, a firewall has a need for packet classification, which is easily and naturally implemented in OpenFlow, but a connection tracker, which cannot be expressed in OpenFlow, might be expressed in SoftFlow and then integrated with OpenFlow classifiers. The paper talks about a number of these SoftFlow features.
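To give a flavor of the idea (the names and the interface below are hypothetical, not the paper's actual API), third-party stateful actions might plug into the pipeline through a narrow registration interface:

    # Hypothetical sketch of pluggable, stateful actions alongside OpenFlow.
    ACTIONS = {}

    def register_action(name):
        def wrap(fn):
            ACTIONS[name] = fn
            return fn
        return wrap

    @register_action("conntrack")
    def conntrack(packet, state):
        # Stateful action: remember this connection across packets.
        key = (packet["ip_src"], packet["ip_dst"])
        state.setdefault(key, 0)
        state[key] += 1
        return packet

    def apply_actions(packet, names, state):
        # The OpenFlow classifier chooses which named actions to run.
        for name in names:
            packet = ACTIONS[name](packet, state)
        return packet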

Ethan contrasts connection tracking via SoftFlow against the Linux kernel based connection tracking that has been recently integrated into Open vSwitch. According to Ethan, the value of SoftFlow for such an action is the infrastructure. Kernel-based connection tracking required a great deal of infrastructure to be built up, and that infrastructure can't necessarily be reused for another stateful action. However, SoftFlow itself provides a reusable framework, simplifying development for each new action built with it.

Ethan explains a firewall example in some detail.

The paper compares the performance of SoftFlow to various alternate implementations, with a focus on Open vSwitch. They measured several pipelines with various traffic patterns and compared a SoftFlow implementation to a more standard NFV implementation, with Open vSwitch as a software switch and the virtual functions implemented as virtual machines. SoftFlow provided a significant performance gain in this comparison.

Ethan describes why he is skeptical of performance measurements of NFV systems in general: first, because they generally measure trivial middleboxes, where the overhead of the actual middlebox processing is negligible, and second, because they focus on minimum-length packets, which may not be realistic in the real world.

Ethan talks about hardware classification offload. This is a general Open vSwitch feature, not actually specific to SoftFlow. Open vSwitch does a packet classification for every packet in the datapath, which is expensive and the bulk of the cost of Open vSwitch packet forwarding. NICs from Intel and Broadcom and others have TCAMs that can perform packet classification in hardware much more quickly than software. These TCAMs have significant limitations but the paper describes how these can be overcome to obtain major speedups for software switching. (This is an area where Open vSwitch's architecture gives it a major advantage over one with an architecture like Click.)

Ethan's current project is Quilt, a container orchestration system whose goal is to find the right model for expressing distributed systems. Quilt assumes the flexibility provided by network virtualization systems and explores how a system built on this flexibility should be architected. It uses a declarative programming language to describe a distributed system and includes software to implement and maintain a system described using the language. The system is designed to be easy to deploy and use with popular distributed systems such as Apache Spark.

You can reach Ethan via email at his website, ej2.org.

OVS Orbit is produced by Ben Pfaff. The intro music in this episode is Drive, featuring cdk and DarrylJ, copyright 2013 by Alex. The bumper music is Yeah Ant featuring Wired Ant and Javolenus, copyright 2013 by Speck. The outro music is Space Bazooka featuring Doxen Zsigmond, copyright 2013 by Kirkoid. All content is licensed under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) license.

Listen: MP3 (35 MB, 38 min).

Episode 9: Adding P4 to OVS with PISCES, with Muhammad Shahbaz from Princeton (Jun 25, 2016)

Interview with Muhammad Shahbaz, a third-year grad student at Princeton advised by Jennifer Rexford and Nick Feamster. Shahbaz talks about his work on PISCES, a version of Open vSwitch modified to add support for P4, a language for programming flexible hardware switches, which will be presented at SIGCOMM in August 2016. Shahbaz is spending this summer as an intern at VMware, where he is working to bring PISCES's features into a form where they can be integrated into an upstream Open vSwitch release.

A P4 program specifies a number of different aspects of a switch: how packets are parsed, how they are processed as they pass through a series of tables, and how the packets are reassembled (“deparsed”) when they egress the switch.

From an Open vSwitch perspective, the main way that P4 differs from OpenFlow is that it allows the user to specify the protocols to be used. Any given version of Open vSwitch, when controlled over OpenFlow, is essentially a fixed-function switch, in the sense that it supports a specific set of fixed protocols and fields, but when P4 is integrated into Open vSwitch, a network developer can easily add, remove, and customize the protocols that it supports.

Modifying C source code and modifying P4 source code are both forms of programming, but P4 source code is much smaller and much more in the “problem domain” for network programming, and thus more programmer-efficient. Because P4 programs tend to be simple and specific to the problem domain, end users who want special features but don't have strong C programming skills can add features themselves. Shahbaz quotes some measurements on the difference in code size: a 20x to 40x reduction when a pipeline is implemented in P4 rather than C.

One must trade some costs for these improvements. In particular, it is a challenge to make P4 perform well in Open vSwitch because the P4 abstract forwarding model is not an exact match for the Open vSwitch or OpenFlow abstract forwarding model. For this reason, the initial PISCES prototype had a 40% performance overhead over regular Open vSwitch for a simple L2/L3 routing application. With a number of optimizations, including those around field updates and checksum verification and update, the penalty was reduced to about 3%, and Shahbaz is optimistic that it can be made faster still, perhaps faster than the current OVS code. The optimizations both reduced the cost in the Open vSwitch “fast path” cache and increased the hit rate for the cache.

The quoted 40% and 3% performance hits for PISCES are actually comparisons against Open vSwitch with its microflow cache disabled, which is not a normal way to run Open vSwitch. This is because PISCES does not yet have a way to specify how to compute the hash used for indexing the microflow cache; in plain Open vSwitch, this hash is computed in (protocol-dependent) NIC hardware, whereas in PISCES it would need to be computed in software.
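For readers unfamiliar with the microflow cache, here is a stripped-down sketch of the concept (the field names are illustrative): an exact-match table, keyed by a hash over selected headers, consulted before the full pipeline:

    # Toy microflow cache: exact match on a key built from packet headers.
    # Stock OVS gets the hash for this lookup cheaply from the NIC; a
    # protocol-independent switch like PISCES would have to compute it in
    # software, since the NIC's hash is protocol-dependent.
    cache = {}

    def flow_key(pkt):
        return (pkt["ip_src"], pkt["ip_dst"], pkt["proto"],
                pkt["tp_src"], pkt["tp_dst"])

    def process(pkt, slow_path):
        key = flow_key(pkt)
        actions = cache.get(key)
        if actions is None:          # cache miss: run the full pipeline
            actions = slow_path(pkt)
            cache[key] = actions     # cache the result for later packets
        return actions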

Shahbaz mentioned that PISCES may be used in the next iteration of Nick Feamster's Coursera course on Software-Defined Networking and talks about the target audience for the course.

Work for the summer, besides getting some of this work into OVS, includes looking into more advanced P4 stateful processing features such as counters, meters, and registers. Ethan Jackson's SoftFlow paper recently presented at USENIX ATC is also relevant to stateful processing in OVS.

To find out more about PISCES or to contact Shahbaz, visit its website at Princeton, which includes a preprint of the paper and links to the Git repository with source code. You can also view slides and video that Shahbaz presented about an early version of PISCES at Open vSwitch 2015 Fall Conference.

OVS Orbit is produced by Ben Pfaff. The intro music in this episode is Drive, featuring cdk and DarrylJ, copyright 2013 by Alex. The bumper music is Yeah Ant featuring Wired Ant and Javolenus, copyright 2013 by Speck. The outro music is Space Bazooka featuring Doxen Zsigmond, copyright 2013 by Kirkoid. All content is licensed under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) license.

Listen: MP3 (29 MB, 31 min).

Episode 8: Mininet, with Bob Lantz and Brian O'Connor from ON.LAB (Jun 18, 2016)

Interview with Bob Lantz and Brian O'Connor of ON.LAB, about Mininet software for simulating networks.

Bob previously gave a talk about Mininet (slides, video) at the Open vSwitch 2015 Fall Conference.

Bob describes the mission of ON.LAB and how he ended up there. He talks about introducing the idea of a network operating system to ON.LAB. He mentioned that his interest in networks arose from a lecture by Nick McKeown in the EE380 lecture series at Stanford, in which Nick stated: “Networks are like hardware without an operating system,” which piqued Bob's interest.

Brian relates his own experience getting involved with SDN, Mininet, and ON.LAB.

Bob describes the genesis of Mininet by analogy to mobile device development. Mobile device development is a pain because no one wants to spend all their time with these tiny devices, so you use a simulator. For network development, you need a simulator too because otherwise you need a huge stack of expensive hardware. Mininet was directly inspired by a network namespaces-based simulator developed in-house at Arista for testing EOS.

Bob compares Mininet to Docker and other container systems. All of these are container orchestration systems that make use of the “namespace” and control group (cgroup) features of the Linux kernel. Mininet gives more control over the network topology than the others.

Bob talks about limitations in OpenStack networking and what he'd like to see OpenStack support in networking.

Brian describes a trend in NFV toward minimization, that is, reducing the amount of overhead due to VMs, often by running in containers instead. He speculates that containers might later be considered too heavyweight. In Mininet, isolation is à la carte: the aspects of network isolation, process isolation, and so on can all be configured independently, so that users do not experience overhead that is not needed for a particular application.

Bob talks about the scale that Mininet can achieve and that users actually want to simulate in practice and contrasts it against the scale (and particularly the diameter) of real networks. Versus putting each switch in a VM, Bob says that Mininet allows for up to two orders of magnitude scale improvement. His original vision was to simulate the entire Stanford network of 25,000 nodes on a rack of machines. Bob talks about distributed systems built on Mininet, which are not officially integrated into Mininet. Distributed Mininet clusters are a work in progress. In general, Mininet scales better than most controllers.

Bob compares Mininet to ns3. ns3 was originally a cycle-accurate simulator, but this made it hard to connect to real hardware and run in real time, so it has moved in a direction where it works in a mode similar to Mininet.

Bob describes the Mininet development community, based on GitHub pull requests. Bob describes a paradox in which they'd like to accept contributions but most of the patches that they receive are not of adequate quality.

Bob talks about performance in OVS related to Mininet, as a review of his previous talk, and especially related to how Mininet speaks to OVSDB. The scale of Mininet doesn't interact well with the design of the OVS command-line configuration tool (ovs-vsctl), which doesn't expect thousands of ports or perform well when they are present. Bob reports that creating Linux veth devices is also slow.
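One documented mitigation, sketched below with invented port names, is that ovs-vsctl accepts many commands in a single invocation, separated by --, which amortizes the per-invocation cost:

    import subprocess

    # Add 1000 ports with one ovs-vsctl process instead of 1000.
    ports = ["s1-eth%d" % i for i in range(1, 1001)]   # hypothetical names
    cmd = ["ovs-vsctl"]
    for port in ports:
        cmd += ["--", "add-port", "br0", port]
    subprocess.run(cmd, check=True)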

Bob describes how to generate traffic with Mininet: however you like! Since you can run any application with Mininet, you can generate traffic with any convenient software.
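For instance, a minimal Mininet script (using the standard Python API; the choice of iperf is just one convenient option) runs ordinary programs on the simulated hosts:

    from mininet.net import Mininet
    from mininet.topo import SingleSwitchTopo

    net = Mininet(topo=SingleSwitchTopo(k=2))   # two hosts, one switch
    net.start()
    h1, h2 = net.get('h1', 'h2')
    h1.cmd('iperf -s &')                        # any program can generate traffic
    print(h2.cmd('iperf -c %s -t 5' % h1.IP()))
    net.stop()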

Brian's wish list: improved support for clustering Mininet, the ability to “dilate time” to make Mininet simulation more accurate to specific hardware, and the ability to model the control network.

You can contact Brian via email. Bob recommends emailing the Mininet mailing list to get in contact with him.

OVS Orbit is produced by Ben Pfaff. The intro music in this episode is Drive, featuring cdk and DarrylJ, copyright 2013 by Alex. The bumper music is Yeah Ant featuring Wired Ant and Javolenus, copyright 2013 by Speck. The outro music is Space Bazooka featuring Doxen Zsigmond, copyright 2013 by Kirkoid. All content is licensed under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) license.

Listen: MP3 (45 MB, 49 min).

Episode 7: The OVS Development Process, with Kyle Mestery from IBM (Jun 11, 2016)

Interview with Kyle Mestery, a Distinguished Engineer at IBM who has been involved with Open vSwitch since about 2012, about the Open vSwitch development process. Our conversation was based on Upstream Open Source Networking Development: The Good, The Bad, and the Ugly, a presentation at ONS 2016 given by Kyle along with Justin Pettit from VMware and Russell Bryant from Red Hat. Kyle also gave a version of the talk with Armando Migliaccio at OpenStack Austin. The latter talk was recorded on video.

The focus of the conversation is to present the Open vSwitch development process by comparing it against the process for OpenStack Neutron and OpenDaylight. All three project names begin with “Open,” but there are significant differences in how they develop code!

How do these projects communicate? All of them have mailing lists, although there are subtle differences in how they use them. Open vSwitch has two main lists, ovs-discuss and ovs-dev. OpenStack, despite being a much bigger project, has only a single development mailing list that it divides up using bracketed “topic tags” supported by the GNU Mailman mailing list manager. OpenDaylight, finally, has many mailing lists per subproject. Kyle explains the advantages and disadvantages of each approach.

All of these projects have IRC channels also. Open vSwitch has a single channel #openvswitch and the other projects have multiple, subproject-specific channels.

OpenDaylight stands out as the only project among the three that relies heavily on conference calls.

Are the projects friendly to newcomers? In general, Kyle thinks so. As with any project, regardless of open or closed source, there will be some existing developers who are super-helpful and others who are overworked or overstressed and less helpful initially. In the end, how you cycle through leaders and contributors in a project is how the project grows.

The projects handle bugs differently as well. Open vSwitch primarily handles bugs on the mailing list. OpenStack files bugs in Launchpad using a carefully designed template. OpenDaylight has a Bugzilla instance and a wiki with instructions and advice. Kyle thinks that Open vSwitch may need to make heavier use of a bug tracker sometime in the future.

The projects have different approaches to code review. OpenDaylight and OpenStack use Gerrit, a web-based code review system, although many developers do not like and avoid the Gerrit web interface, instead using a command-line tool called Gertty. Open vSwitch primarily uses patches emailed to the ovs-dev mailing list, similar to the Linux kernel patch workflow. In-flight patches can be monitored via Patchwork, although this is only a tracking system and has no direct control over the Open vSwitch repository. Open vSwitch also accepts pull requests via GitHub.

Kyle mentions some ways that the Open vSwitch development process might benefit from approaches used in other projects, such as by assigning areas to particular reviewers and dividing the project into multiple, finer-grained repositories. OVN, for example, might be appropriate as a separate project in the future.

Kyle's advice: plan ahead, research the projects, give your developers time to become comfortable with the projects, treat everyone with respect, treat everyone equally, and give back to the core of the project. Keep in mind that little maintenance patches are as important as huge new features. Finally, trust your developers: you hired good people, so trust their ability to work upstream.

The interview also touches on:

You can reach Kyle as @mestery on Twitter and follow his blog at siliconloons.com.

OVS Orbit is produced by Ben Pfaff. The intro music in this episode is Drive, featuring cdk and DarrylJ, copyright 2013 by Alex. The bumper music is Yeah Ant featuring Wired Ant and Javolenus, copyright 2013 by Speck. The outro music is Space Bazooka featuring Doxen Zsigmond, copyright 2013 by Kirkoid. All content is licensed under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) license.

Listen: MP3 (35 MB, 38 min).

Episode 6: sFlow, with Peter Phaal from InMon (Jun 2, 2016)

Interview with Peter Phaal of InMon, about sFlow monitoring and how it is used with Open vSwitch. In summary, an sFlow agent in a switch (such as Open vSwitch or a hardware switch) selects a specified statistical sample of packets that pass through it, along with information on how the packet was treated (e.g. a FIB entry in a conventional switch or OpenFlow actions in Open vSwitch) and sends them across the network to an sFlow collector. sFlow agents also periodically gather up interface counters and other statistics and send them to collectors. Data collected from one or more switches can then be analyzed to learn useful properties of the network.
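As a concrete point of reference, here is a commonly documented way to attach an sFlow agent to an Open vSwitch bridge (the bridge name, agent interface, and collector address are placeholders):

    import subprocess

    # Create an sFlow record sampling 1 in 64 packets, polling interface
    # counters every 10 seconds, and point bridge br0 at it.
    subprocess.run([
        "ovs-vsctl", "--", "--id=@s", "create", "sflow",
        "agent=eth0", 'target="10.0.0.1:6343"',
        "header=128", "sampling=64", "polling=10",
        "--", "set", "bridge", "br0", "sflow=@s",
    ], check=True)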

Peter begins with a description of the history of sFlow, including its pre-history in network monitoring products that Peter was involved in at HP Labs in Bristol. At the time, network monitoring did not require a special protocol such as sFlow, because networks were based on a shared medium to which any station could listen. With the advent of switched networks, the crossbar inside each switch effectively became the shared medium and required a protocol such as sFlow to look inside.

Peter compares the data collected by sFlow to a “ship in a bottle,” a shrunken model of the network on which one can later explore route analytics, load balancing, volumetric billing, and more. He says that SDN has empowered users of sFlow by providing a control plane in which one can better act on the information obtained from analytics:

“If you see a DDoS attack, you drop a filter in and it's removed from the network. If you see a large elephant flow taking a path that's congested, you apply a rule to move it to an alternative path. So it really unlocks the value of the analytics, having a control plane that's programmable, and so I think the analytics and control really go hand-in-hand.”

sFlow can be used in real time or for post-facto analysis. The latter is more common historically, but Peter thinks that the potential for real-time control is an exciting current development.

In contrast to NetFlow and IPFIX, sFlow exports relatively raw data for later analysis. Data collected by sFlow can be later converted, approximately, into NetFlow or IPFIX formats.

Other topics:

Further resources on sFlow include sflow.org for the sFlow protocol, sflow.net for the sFlow host agent, and Peter's blog at blog.sflow.com.

You can find Peter on Twitter as @sFlow.

OVS Orbit is produced by Ben Pfaff. The intro and bumper music is Electro Deluxe, featuring Gurdonack, copyright 2014 by My Free Mickey. The outro music is Girls like you, featuring Thespinwires, copyright 2014 by Stefan Kartenberg. All content is licensed under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) license.

Listen: MP3 (38 MB, 41 min).

Episode 5: nlog, with Teemu Koponen from Styra and Yusheng Wang from VMware (May 26, 2016)

Interview with Teemu Koponen of Styra and Yusheng Wang of VMware, about the nlog language.

nlog, in this context, is unrelated to the logging platform for .NET. It is a database language, a simplified form of Datalog that lacks recursion and negation. Teemu designed this language for use in Nicira NVP, the forerunner of VMware NSX-MH. Yusheng is now working to implement nlog in OVN.

Teemu and Yusheng begin by describing the nlog language, its name (the “N” stands for “Nicira”), and its purpose, and contrast it with more commonly known languages such as SQL. An nlog (or Datalog) program consists of a series of queries against input tables that produce new tables, which can be reused in subsequent queries to eventually produce output tables.

In a network virtualization system such as NVP or OVN, input tables contain information on the configuration or the state of the system. The queries transform this input into flow tables to push down to switches. The nlog program acts as a function of the entire contents of the input tables, without reference to a concept of time or order. This simplifies implementation, because it avoids the ordering problems found so pervasively in distributed systems. Thus, versus hand-coded state machines, nlog offers better hope of correctness and easier quality assurance, since it allows programmers to specify the desired results rather than all of the possible state transitions that could lead there.
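A tiny Python rendering of that evaluation model (the table and column names are invented for illustration) shows output tables derived as a pure function of the inputs:

    # Datalog-style rule: flow(port, hv) :-
    #     logical_port(port, vm), vm_location(vm, hv).
    logical_ports = {("lp1", "vm1"), ("lp2", "vm2")}   # (port, vm)
    vm_locations = {("vm1", "hv-a"), ("vm2", "hv-b")}  # (vm, hypervisor)

    def derive_flows(logical_ports, vm_locations):
        # Re-derived from scratch whenever an input table changes: no
        # ordering or time, just a join over the current table contents.
        return {(port, hv)
                for (port, vm) in logical_ports
                for (vm2, hv) in vm_locations
                if vm == vm2}

    print(derive_flows(logical_ports, vm_locations))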

Topics include:

You can reach Teemu at koponen@styra.com and Yusheng at yshwang@vmware.com.

OVS Orbit is produced by Ben Pfaff. The intro and bumper music is Electro Deluxe, featuring Gurdonack, copyright 2014 by My Free Mickey. The outro music is Girls like you, featuring Thespinwires, copyright 2014 by Stefan Kartenberg. All content is licensed under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) license.

Listen: MP3 (34 MB, 38 min).

Episode 4: Cilium, with Thomas Graf from Cisco (May 21, 2016)

Interview with Thomas Graf of Cisco, regarding the Cilium project.

Cilium is a “science project” that Thomas and others at Cisco and elsewhere are hacking on, to explore how to enforce policy in a legacy-free container environment that scales to millions of endpoints. It's an experiment because the outcome isn't yet certain, and it's a question that hasn't seen much work outside of hyperscale providers.

Cilium is based on eBPF, a Linux kernel technology that lets userspace inject custom programs into the kernel, using a bytecode analogous to Java virtual machine bytecode. Cilium uses eBPF-based hooks that intercept packets at various places in their path through the kernel to implement a flexible policy engine.
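A minimal taste of the mechanism (a generic bcc example, not Cilium's code): userspace compiles a small C program and the kernel runs it at a hook point:

    from bcc import BPF   # requires the bcc toolkit

    # Inject a program that fires on entry to the kernel's IPv4 receive
    # path; Cilium attaches richer programs at networking hooks instead.
    prog = r"""
    int on_ip_rcv(void *ctx) {
        bpf_trace_printk("ip_rcv hit\n");
        return 0;
    }
    """
    b = BPF(text=prog)
    b.attach_kprobe(event="ip_rcv", fn_name="on_ip_rcv")
    b.trace_print()   # stream the in-kernel program's messages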

Topics include:

More information about Cilium: slides and the code repository.

You can find Thomas on the ovs-dev mailing list, @tgraf__ on Twitter, or on Facebook.

OVS Orbit is produced by Ben Pfaff. The intro and bumper music is Electro Deluxe, featuring Gurdonack, copyright 2014 by My Free Mickey. The outro music is Girls like you, featuring Thespinwires, copyright 2014 by Stefan Kartenberg. All content is licensed under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) license.

Listen: MP3 (30 MB, 32 min).

Episode 3: OVS in Production, with Chad Norgan from Rackspace (May 8, 2016)

Interview with Chad Norgan of Rackspace, about use of Open vSwitch at Rackspace over the years.

Topics include:

Chad can be contacted as @chadnorgan on Twitter and as BeardyMcBeard on the freenode IRC network.

OVS Orbit is produced by Ben Pfaff. The intro and bumper music is Electro Deluxe, featuring Gurdonack, copyright 2014 by My Free Mickey. The outro music is Girls like you, featuring Thespinwires, copyright 2014 by Stefan Kartenberg. All content is licensed under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) license.

Listen: MP3 (17 MB, 19 min).

Episode 2: OPNFV and OVS, with Dave Neary from Red Hat (May 4, 2016)

Interview with Dave Neary of Red Hat, concerning OPNFV and its relationship with Open vSwitch.

Topics include:

You can find Dave at @nearyd on Twitter.

OVS Orbit is produced by Ben Pfaff. The intro and bumper music is Electro Deluxe, featuring Gurdonack, copyright 2014 by My Free Mickey. The outro music is Girls like you, featuring Thespinwires, copyright 2014 by Stefan Kartenberg. All content is licensed under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) license.

Listen: MP3 (21 MB, 23 min).

Episode 1: Porting OVS to Hyper-V, with Alessandro Pilotti from Cloudbase (May 1, 2016)

An interview with Alessandro Pilotti of Cloudbase, which Alessandro describes as the company that takes care of everything related to Microsoft technologies in OpenStack. The interview focuses on the Open vSwitch port to Hyper-V, to which Cloudbase is a top contributor.

Highlights and topics in this episode include:

OVS Orbit is produced by Ben Pfaff. The intro and bumper music is Electro Deluxe, featuring Gurdonack, copyright 2014 by My Free Mickey. The outro music is Girls like you, featuring Thespinwires, copyright 2014 by Stefan Kartenberg. All content is licensed under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) license.

Listen: MP3 (31 MB, 34 min).