Network Automation Nerds Podcast

#052: Innovation in Network Automation with Damien Garros: The OpsMill Solution - Part 1

February 07, 2024 Eric Chou

In today's episode, we will go on a journey through the world of network automation with Damien Garros, the brain behind OpsMill.

In part 1 of our conversation, we will discuss his path in network engineering, including moving from France to the US and the initial challenges he faced with both the engineering and the cultural differences. He eventually became a voice of change at Juniper Networks and beyond.

The search for a reliable "source of truth" in infrastructure management can feel like a quest for the Holy Grail. In our discussion, Damien and I dissect the limitations of traditional databases and the allure of Git for its version control capabilities as we acknowledge both of their shortcomings.

Join us as we explore the journey of OpsMill, understand its impact on the industry, and learn about the future plans from Damien himself. This episode promises to be a treasure trove of insights for enthusiasts and professionals alike.

Let's dive right in with Damien!


Connect with Damien on LinkedIn: https://www.linkedin.com/in/damiengarros/
Follow Damien on Twitter:
https://twitter.com/damgarros
Check out OpsMill:
https://www.opsmill.com/   
AutoCon0 Damien Presentation Slides:
https://speakerdeck.com/dgarros/network-source-of-truth-and-infrastructure-as-code-revisited-autocon0
Damien GitHub:
https://github.com/dgarros 

--- Stay in Touch with Us ---

Subscribe on YouTube:
https://www.youtube.com/c/EricChouNetworkAutomationNerds
Follow Eric on Twitter:
https://twitter.com/ericchou
Network Automation Learning Community:
https://members.networkautomation.community/  

Eric Chou:

Network Automation Nerds Podcast. Hello, welcome to Network Automation Nerds Podcast, the podcast about network automation, network engineering, Python and other technology topics. I'm your host, Eric Chou. In today's episode, we welcome Damien Garros, CEO and co-founder of OpsMill. From what I could gather, OpsMill is a platform designed to make infrastructure more agile, robust and well documented. Of course, we'll be asking him more about the details in today's episode and in next week's episode. I first became aware of OpsMill after the recently concluded AutoCon 0. Many of our colleagues came away impressed and excited about what Damien is trying to build. I reached out to Damien and he graciously agreed to come onto the show, and I am super excited. Let's dive right in. Welcome, Damien. How are you?

Damien Garros:

Hey, thanks. Hi, Eric. I'm very happy to be here and, yes, to share more of what we're working on. Thank you for the intro.

Eric Chou:

No, the pleasure is all mine. Like I said, there are a lot of people, in particular Kirk Byers (he's the creator of Netmiko, for people who are not familiar: well respected, one of the most widely used tools). He came away and said, hey, this is the product that I'm most excited about. I hope I didn't misunderstand his tweet, but if I did, it is what it is. I'm sticking to my story anyway. So it's great to have you here.

Damien Garros:

Yeah, thank you. Thank you, Kirk. And a quick fun story about AutoCon, actually, one part I really loved: at the end, I think, they had a panel and Kirk was on it. Everyone who came to the mic would say "thank you, Kirk" for all the work that he does for the community.

Eric Chou:

Everybody was like, hey, can I kiss the ring?

Damien Garros:

Yeah, that was very cool. I really think it was very nice, because he's doing a lot of stuff and he deserves a lot of credit for it.

Eric Chou:

Oh, definitely. And I've mentioned multiple times, I've taken Kirk's class, the free one, and then I took the paid one. I met up with him when I was in San Francisco, and I reach out to him constantly. He's just a nice dude and he's doing a lot of good stuff. So if I were attending AutoCon (unfortunately I couldn't make it this year), I would have a fanboy moment too, definitely. All right, cool. So, Damien, let's get back to the origin story. With all the guests, I'm always interested in how they got into networking, how they got into technology. Were you like a child prodigy, who wrote your first game when you were five and sold it for billions, or what?

Damien Garros:

No, actually, a different story. When I got my master's, I was barely speaking English. I almost didn't get it because I couldn't speak English, and I think in the first class they thought I was a desperate case, because I couldn't write code and I didn't understand how IP address subnetting worked. So I definitely didn't do all of that when I was eight years old. I actually started in pre-sales, first working for a VAR. I made my way into Juniper, and it's really when I started at Juniper that I began understanding the core of automation. At the time I was working with a very big customer in France and I was doing a lot of network design for them. That was part of the SE role; we were always designing at the same time as the project. Very quickly I realized that Junos had all of those features at the time and I could use them to improve the design. It started making a lot of sense to me: as a network architect, I have all of those features that I can use to make the design and the network more efficient. So I started working on that. I think my very first feature was this: they were missing something in a product to monitor some of the internals, and the customer really wanted that. I ended up hacking something in SLAX, I think that was the name. At the time it was the pre-Python world at Juniper, and that actually helped us win the deal. From there, at some point, through Juniper, I managed to get into the US. In 2013 I moved into Juniper engineering. I think my English was still very bad at the time, so I didn't want to be customer facing. I ended up joining system testing, and my role was to write an automation platform to test some of their biggest switching products.

Eric Chou:

Was it the EX line, or was it...

Damien Garros:

It was the QFX line. So my job was to write an automation suite that would let us test the biggest QFabric. They had like 6,000 10-gig ports. We had, I think, about a hundred Spirent ports. That was a beast. That was just exciting, and so I built some automation for them. Then very quickly I moved into more of a technical evangelist role for the automation stack, in marketing, and after that I joined Apstra. I learned a lot at Apstra, and then I moved to Roblox, where I got a chance to be the first member of the network automation team. We did an amazing, amazing project there, and then I really wanted to basically take everything we did at Roblox and make it available for more companies.

Eric Chou:

Right. Are you talking about Roblox, the gaming company? Oh, okay, wow, interesting. I didn't know that. It seems very different from Juniper.

Damien Garros:

Yeah, very different. But when you're a TME, at some point you're talking a lot about the technology, but you're always doing demos. Sometimes you get a bit frustrated that you're not actually doing it for real, and you feel like something is missing. So at some point I really wanted to go on the other side and actually do it on the customer side. I was very happy, and again, at Roblox I was so lucky: I joined just at the time when they started rebuilding everything. In 18 months we deployed 12 PoPs worldwide, we rebuilt 2 data centers, migrated and all that. It was just a crazy, crazy, crazy time, but super exciting.

Eric Chou:

Yeah, I have two kids, so I am familiar with all the block coding, the Raspberry Pi stuff, Roblox certainly. So it's interesting to hear your thoughts on the back end of it. A lot of startups choose to start with the cloud, but it seems like Roblox wanted to own that stack.

Damien Garros:

Yeah, because they actually run the games for all the players, and there's also a lot of traffic, so at the time it was making sense. And also, and that was interesting, initially the game engine was actually all on Windows.

Eric Chou:

Oh no.

Damien Garros:

But when I was there, someone actually managed to migrate everything to Linux and all that. But initially there was a lot of that. So I think they just had better performance on their own hardware.

Eric Chou:

So yeah, obviously, because I worked at Amazon and Azure, I've seen a lot of companies start there. But there are also many companies that choose, whether by regulation or for security, to have their own stack on prem. And I've also seen companies such as Box (I'm sorry, Dropbox) that started out in the cloud and then migrated to their own infrastructure, because at some point it's actually cheaper to run your own than to do hosting.

Damien Garros:

Actually, a big part of the team at Roblox came from Dropbox, because [inaudible] was actually one of the members of the team at Dropbox that did that migration.

Eric Chou:

Did that migration? Oh wow. Yeah, very small, very small world.

Damien Garros:

But that was a very interesting time. It felt like all the cool things we built were really custom made for us, and in a sense I wanted to take that to the rest of the industry. I wanted to see if we could build more of an off-the-shelf solution that more people could use. That's when I had the opportunity to join Network to Code, and then I got exposed to a lot of different customers. My role was really to partner with our customers, help them define their automation strategy and then help them implement that strategy: define the architecture, work with our team, write the code and all that. So that was very exciting. They gave me the opportunity to try a lot of different things. At the time we were all about the source of truth, so we were doing a lot of projects with NetBox, and then with Nautobot. We were always trying to see how we could take those platforms further and further, and that's how we came up with Nautobot. And at some point it felt like all of my customer conversations were ending up being: hey, how can I have the best of infrastructure as code and the best of Nautobot or NetBox? Which means they wanted a UI, they wanted an API, they wanted all of that, but they also wanted the flexibility of Git and version control. That's probably the summary of the last five years of my life: how do you do that? After trying for five years, it felt like we might not be able to succeed with the current solutions, and I felt strongly that we had to maybe start from scratch and redesign exactly the right components. We can go into more details on that, but that's kind of the genesis of OpsMill and why we ended up building that.

Eric Chou:

Yeah, definitely. First I want to thank you for even having the mindset of trying to share something that's proprietary, or somewhat behind closed doors, with the community. Not everybody thinks that way. I won't name any names, but a lot of times there's a hoarding mentality: this is our competitive advantage, this is where we make the money, and we can't give our secret sauce away. So first of all I want to thank you, and I think the community benefits from this kind of Linux and GNU mindset of trying to share with the community. And of course, let's talk about it. For our listeners: if you're listening, that's great, we'll have the link in the show notes. But for those who want to, or have the capacity to, look at the YouTube channel, we actually have Damien's presentation slides up, and he's going to drive a little bit and talk us through the start of OpsMill and what problem they're trying to solve. So let's start with that, because that's where you left off: you were going back to the drawing board. So what problem is OpsMill trying to solve?

Damien Garros:

So I think we have kind of already started on it. Everything I have seen in automation that has been really successful had a whole platform with many components, and usually in the center you have the source of truth.

Eric Chou:

And by source of truth, what do you mean?

Damien Garros:

For people who are not familiar, it's where you store the expected state of your infrastructure. That's how I define the source of truth. By the way, if I can say, I hate this term, "source of truth". I use it all the time, but I feel like it's too generic, and every person has a different opinion of it.

Eric Chou:

Right. What do you mean?

Damien Garros:

I wish we could move past "source of truth" and find a better name for it. I have a few proposals. But you have a data set in the middle, and this data set captures the intended state of the network. That's what drives your automation, and all the other components connect into that source of truth. One thing we're seeing is that today most people have actually been reinventing that component on their own, and the most successful organizations I've seen have been building their own database. What we're seeing is really two categories of solutions. Either people put all of their data in a database, and I think the most successful ones are using their own database. Or, and it's a big part of the industry, they use Git as their source of truth. That's really this trend that we call infrastructure as code. And of course, in the database category you have NetBox and Nautobot. But what we found is that the really important features of the source of truth are that you need to have a very flexible schema and you need to have version control. You need to be able to control how the data gets in, because by design everything that gets in gets deployed, so it's super important. And I think the closest tool that has those two capabilities is probably Git, because in many ways Git is super flexible for the schema, since it doesn't really have a schema, and it has the version control. Now, there are a lot of downsides to that. So that's what we're seeing, and again, our analysis has been that the reason we have two different systems is that they have different capabilities. And that's after going to the whiteboard and all that.
If you look at what an engineering team needs to do on a day-to-day basis, they need to manage standard changes, like adding VLANs, adding new services, just adding to their infrastructure. They also have to manage migrations and design changes, changes that are a lot more complicated and less well defined in many ways. What we're seeing is that to manage those different changes, you need different solutions. Day-to-day changes are very well understood. To manage them easily, you need to have an API, you need to have a form, you need to have all those things, which fits perfectly with the database approach. It's easy to deploy changes that you understand, if you can codify them, when you have a database at the center, because you have the API, you have the reactivity, and it's easy.

Eric Chou:

Yeah, very structured and very well understood. So, like you said, if it's understood, you can codify it and then you can kind of rinse and repeat.

Damien Garros:

Yeah, but all those hard changes, those non-standard changes, like when you do a design change, are much more complicated, because you want peer review, you want an isolated environment, and you will potentially take multiple days to prepare your change. Those are really hard to do in a database-first approach, because you cannot just have a staging environment of your database. You have to prepare everything outside, but you never have the full visibility. And again, that's exactly what Git provides. Git is so good at providing an isolated environment. You can do peer review; complex changes are super easy to manage with Git. But on the other side, with Git you don't have a query engine, you don't have a schema, there's no user interface, and you don't have an API. So there are all of those things, and again, the analysis has been pretty much that we have two main types of changes that we need to manage, and we built a solution for each of them. But the issue is that we need both: I don't know a single environment that doesn't need to support both types of changes. I think that's where we have all of those struggles today. If you are in a situation where your organization wants to automate the standard changes, then you will go database first; if you're looking for the other one, you will go Git first. But there's no solution that can provide both. And that was really when we realized that, okay, we need to have version control, we need to have a very extensible schema, we need to have the query engine and all of those things, and that's pretty much why we started OpsMill.
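To make the combination discussed above more concrete (schema checks and an API on one side, isolated branches and a reviewable diff on the other), here is a minimal Python sketch. It is an illustration only, not how OpsMill or any other product is implemented: the `BranchableStore` class, the `SCHEMA` dictionary and the naive `merge` are all hypothetical names and simplifications.

```python
from copy import deepcopy

# Hypothetical schema: each kind lists the attributes a record must carry.
SCHEMA = {"vlan": {"vlan_id", "name"}, "device": {"name", "role"}}


class BranchableStore:
    """Toy source of truth: schema-checked records plus Git-like branches."""

    def __init__(self):
        self.branches = {"main": {}}

    def create_branch(self, name, source="main"):
        # A branch starts as an isolated copy of its source.
        self.branches[name] = deepcopy(self.branches[source])

    def upsert(self, branch, kind, key, record):
        # Schema check: reject unknown kinds and records with wrong attributes.
        if kind not in SCHEMA:
            raise ValueError(f"unknown kind: {kind}")
        if set(record) != SCHEMA[kind]:
            raise ValueError(f"{kind} needs exactly {sorted(SCHEMA[kind])}")
        self.branches[branch][(kind, key)] = dict(record)

    def diff(self, branch, target="main"):
        # What a reviewer would look at before approving the change.
        src, dst = self.branches[branch], self.branches[target]
        return {
            "added": sorted((k for k in src if k not in dst), key=repr),
            "changed": sorted(
                (k for k in src if k in dst and src[k] != dst[k]), key=repr
            ),
            "removed": sorted((k for k in dst if k not in src), key=repr),
        }

    def merge(self, branch, target="main"):
        # Naive merge: the branch state replaces the target, with no
        # conflict detection. A real system would need much more care.
        self.branches[target] = deepcopy(self.branches[branch])


store = BranchableStore()
store.upsert("main", "vlan", 10, {"vlan_id": 10, "name": "users"})
store.create_branch("redesign")                      # isolated workspace
store.upsert("redesign", "vlan", 20, {"vlan_id": 20, "name": "voice"})
print(store.diff("redesign"))                        # review before merging
store.merge("redesign")
```

A standard change would call `upsert` directly on `main`, while a multi-day design change would live on its own branch until the diff has been reviewed.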

Eric Chou:

So let's double click on that Git-first approach a little bit. We have that graph on what you talked about: the rigid changes, where you have DB first, and then you have Git first. So why can't we have Git first but also have a... I mean, when you say you don't have the schema, the rigidity, it's because Git is so flexible; it allows you to do anything as long as it's text-based. But why can't we have guardrails around it so that we protect the DB schema, and have a UI in parallel? Are you aware of any solution on the market that does that? Or are you just saying that is why we're doing it?

Damien Garros:

No. So there are actually multiple solutions coming up now around this idea of bringing version control and change control into a database. It's actually growing, and there are multiple solutions out there. For example, there's Dolt, and at Network to Code we started exploring how we could use this new database, which is pretty exciting. It's like a MySQL database, completely MySQL compatible, and it has all the constructs of Git internally, so you can actually create branches in this database and all of that. So it's exciting, and we started looking at how we could integrate that into Nautobot. Now, why didn't we go with those databases? I think the challenge for us is, remember, we have to manage the version control, but we also have to have this extensible schema, and a relational database is the opposite of a flexible schema. In a relational database, by definition, the schema sits at the core, very low level, and changing the schema is a really big operation with a lot of constraints, also because everything is stored as tables. So if you accept that the main goal is having a very flexible schema, we should not use a relational database. That's never going to work, because it's kind of the opposite of what relational databases are: you define the schema at the beginning and then you have a super performant system that lets you consume it. But it's not about extensibility.
And that's when I realized that Django, NetBox and Nautobot were really built around this relational database, and that it will never go away, not in those technology stacks. That's when I started looking elsewhere. Maybe it's worth stepping back a bit. When I told you that after Juniper I went to Apstra, one thing that I learned there...

Eric Chou:

For people who are not familiar with Apstra, just tell us what Apstra does.

Damien Garros:

Yeah, it's a solution that's really focused on managing the entire lifecycle of data center fabrics, from the design to the deployment and the monitoring. It was really built initially to work on any type of hardware. It will build the configurations and do everything for you.

Eric Chou:

Yeah, my basic understanding is that Apstra treats your network as a whole. It doesn't really care what components are in it; it cares about what you want to do with it, with more intent-based instructions: I just want 64 times 10-gig links in this layer, and I don't care how you do it. Is that true?

Damien Garros:

Yeah, that's very true. I see it as a vertical solution. It really addresses the data center market and segment, and provides a very customized and optimized solution, again from the design to the deployment, to the operations and monitoring. And one thing they had at Apstra is that they were using a graph database at their core. It's a different type of database than a relational database. I think I have a slide on that, let me go to it. But what's interesting is this type of database.

Eric Chou:

I'm so excited you have visuals, because otherwise it would be really hard. I apologize if you're driving or something and you're just trying to wrap your brain around it, but it's a very nice visual, by the way, this relational versus graph database. But go ahead.

Damien Garros:

So honestly, the first time I got exposed to it I was just thrown off. I didn't know they were making that change, and I remember when they said, hey, we changed everything, and suddenly we didn't have a REST API, we didn't have all of that. We had this new interface, and that was a shock. But very quickly I started working on it and digging, and then I realized how powerful those databases are. It's funny, because relational databases have "relational" in the name, but it's actually graph databases that are really designed for relational data, and you have much more flexibility in how you connect things. You don't have the constraint that if you add an attribute or a relationship you have to apply it to all of them. It's much, much easier to extend the schema in this environment, and it's super powerful. So I was exposed to that, and since Apstra it was always in the back of my mind: that's the type of technology we should use. Except at the time the ecosystem was not using any of that. And so when I finally gave up on relational databases for solving the network source of truth problem, that was pretty much my turning point. After studying that for many years and looking at how we could do it, I found that there are actually techniques for storing the data in a graph database that make it what they call a temporal graph. It's a way where you never delete any data, and you have the ability to travel in time through all of your data and go back to past versions and all of that. So it was a very interesting concept.

Eric Chou:

It sounds like magic, it sounds like Dr Strange.

Damien Garros:

No, there are a lot of papers on it. But pretty much what we did is: okay, we like this idea of temporal data, an immutable database. Could we actually extend that in multiple dimensions, so that we not only have one temporal database, but we can also have multiple branches of that database? That's pretty much what we ended up building, and that was really the genesis of OpsMill and specifically Infrahub, which is the solution that we're building.
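The idea described here (never delete data, read any past state, and add branches as a second dimension) can be sketched in a few lines of Python. This is a toy under stated assumptions, not the actual data model of any product mentioned in the conversation: the `TemporalBranchStore` class, its logical clock and the flat string keys are hypothetical, and a real temporal graph would version nodes and edges rather than key/value pairs.

```python
class TemporalBranchStore:
    """Append-only store with time travel and branches (illustrative only)."""

    def __init__(self):
        self.clock = 0                     # logical clock instead of timestamps
        self.facts = []                    # (time, branch, key, value), never deleted
        self.branch_points = {"main": 0}   # branch -> time it forked from main

    def create_branch(self, name):
        # A branch only remembers when it forked; no data is copied.
        self.branch_points[name] = self.clock

    def set(self, key, value, branch="main"):
        # Every write appends a new fact; use value=None as a tombstone.
        self.clock += 1
        self.facts.append((self.clock, branch, key, value))
        return self.clock

    def get(self, key, branch="main", as_of=None):
        # Replay history up to `as_of`: a branch sees its own facts plus
        # whatever main contained at the moment the branch was created.
        as_of = self.clock if as_of is None else as_of
        fork = self.branch_points[branch]
        result = None
        for time, fact_branch, fact_key, value in self.facts:
            if time > as_of or fact_key != key:
                continue
            on_branch = fact_branch == branch
            inherited = fact_branch == "main" and time <= fork
            if on_branch or inherited:
                result = value
        return result


store = TemporalBranchStore()
t1 = store.set("leaf1.mtu", 1500)
store.create_branch("redesign")
store.set("leaf1.mtu", 9000, branch="redesign")
store.set("leaf1.mtu", 1400)

print(store.get("leaf1.mtu"))                      # current value on main
print(store.get("leaf1.mtu", branch="redesign"))   # value on the branch
print(store.get("leaf1.mtu", as_of=t1))            # main as it was at t1
```

Because nothing is overwritten, "going back in time" is just a read with a smaller `as_of`, and a branch is a filter over the same history rather than a copy of the database.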

Eric Chou:

So, since we have the relational versus graph slide up, can you give a few examples? I think this is where our listeners will have questions. Many, like me, are familiar with relational databases and the rigid structure around them. For example, for something like Nautobot or NetBox, you have to define a region, and then you have to define physical locations (East Coast US, West Coast US, North Europe, South Europe, whatever), and then you have that relation into that region, and then you have your devices within the data center within that region. These relations don't change, and they're perfect for a relational database. But can you give us some examples of relationships that are not suitable for a relational database, and for which a graph database would be perfect?

Damien Garros:

Yeah, and actually that's maybe a parallel to what we were trying to do with NetBox and Nautobot. I speak about it in this presentation. Let me step back a little bit and then come back to your question. I think in the last five to eight years we have really started embracing the idea that we need to put more data into the same schema, in the same place, so that we can connect and capture the relationships between those different data. I really think that when NetBox came out, that was the first time, at least for a lot of people, that we had IPAM data with DCIM and circuit information in a single place, with a nice API, and it was easy to consume. So we saw the power of having a lot of data sets that used to be separate in the same database, so that we can query across them. Over time, when NetBox started having plugins, we were able to add more, and we started to extend the types of data. For me that was really an eye opener: the more data we put in, the more we can capture these relationships, the easier it is to actually manage this large amount of data, and as a result we have better data. When we make a change, we can understand the impact of that change, and we don't have to keep duplicate data sets that get out of sync and all that. And there was also this idea that there's all the technical data in there, but we were seeing customers who wanted to have different information, like design information, or potentially customer information.

Eric Chou:

Oh, I see what you're saying, okay.

Damien Garros:

Imagine now that you want to represent a customer, so you want to create a new table in your system that represents your customers. And now you want to capture the relationship between your customers and every single technical element or resource that they use, whether it's an IP address, a VLAN or an interface, so that you understand, technically, what you should bill them for. That's the kind of relationship you might want to create in your database. But in a relational database, if someone provided the initial schema for you, with the IP addresses and all that, you may not actually be able to create that relationship, because, for example, if it's a one-to-one relationship, that would require changing the definition of the initial table, and that's something you could not do. That was something we were always fighting at Network to Code: the system was coming with a default schema, and we wanted to provide extensibility, but there were a lot of things that were not possible. So we were providing (I actually developed this myself) a feature in Nautobot called custom relationships. Because we couldn't extend the tables of the existing objects, we had to do it in a less performant way, and we were doing our best to make it super easy to consume for customers. But the truth is that there were a lot of very technical things happening in the background, and I think we always had a performance hit, because those relationships were not as performant as the other ones. It was just super complicated to manage all that.
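The customer example above can be illustrated with a small property-graph sketch in Python. It only shows the general idea that, in a graph model, a new kind of node and a new kind of relationship can be added without altering the definition of existing objects. The `Graph` class, the node identifiers and the `uses` relationship name are assumptions made up for this example; this is not Nautobot's custom relationship implementation nor any graph database's API.

```python
from collections import defaultdict


class Graph:
    """Tiny in-memory property graph (illustrative only)."""

    def __init__(self):
        self.nodes = {}                # node id -> {"kind": ..., attributes}
        self.edges = defaultdict(set)  # (source id, relationship) -> target ids

    def add_node(self, node_id, kind, **attrs):
        self.nodes[node_id] = {"kind": kind, **attrs}

    def relate(self, source, relationship, target):
        # A new relationship type is just a new edge label. Existing nodes
        # are left untouched, unlike adding a foreign-key column to a table.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist")
        self.edges[(source, relationship)].add(target)

    def related(self, source, relationship):
        return sorted(self.edges.get((source, relationship), ()))


graph = Graph()

# The "default schema" shipped with the tool: technical objects only.
graph.add_node("ip:10.0.0.1", "ip_address", address="10.0.0.1/32")
graph.add_node("vlan:10", "vlan", vlan_id=10)
graph.add_node("intf:leaf1:eth1", "interface", name="eth1")

# Later, a user adds a brand-new kind and links it to existing objects.
graph.add_node("customer:acme", "customer", name="Acme")
for resource in ("ip:10.0.0.1", "vlan:10", "intf:leaf1:eth1"):
    graph.relate("customer:acme", "uses", resource)

print(graph.related("customer:acme", "uses"))
```

In a relational design, the equivalent change would typically mean a new column on each existing table or an extra join table, which is the kind of constraint described in the answer above.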

Eric Chou:

So yeah, it almost seems like... sorry, I cut you off, go ahead.

Damien Garros:

No, hopefully that answers your question.

Eric Chou:

Yeah, I think that did. I got a glimpse into the starting point of why you faced this problem. Just like in the real world, there are some relationships that are very straightforward, one-to-one: a circuit has one circuit ID, so that mapping is easy to come up with. But there are also very complicated relationships that are dynamic and very hard to grasp. If I take my own example of a family, I have relatives where both the husband and the wife were divorced once, then they remarried and had kids again. So it's your kids and my kids and our kids, and the family tree becomes very commingled. That seems to be a shortcoming of a relational database when the relationships are dynamic and hard to map.

Damien Garros:

Yeah, that's a good example. And one thing I realized at some point is that this idea of putting a lot of different data in the same database, in a single schema, so that we can make very advanced queries and extract more value from that data, is actually not something that's specific to our industry. It exists in other industries too, and there's even a market for it: people call that knowledge graphs. In retail, in pharma, in cybersecurity, in a lot of different places, people have been doing exactly the same. They realized that they need a super flexible database with a very flexible schema, that they put all of that data in a central place and create those relationships, and that with a very powerful query engine they can do more with their data.
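As a rough illustration of the kind of query that becomes easy once all the data and relationships live in one graph, the Python sketch below walks the relationships from a device to find every customer that a change on that device could affect. The `EDGES` data, the `kind:name` identifier convention and the `impacted` function are invented for this example; a real knowledge graph would use a graph query language rather than a hand-written traversal.

```python
from collections import deque

# Hypothetical data: each node points at the nodes it relates to.
EDGES = {
    "device:leaf1": ["interface:leaf1:eth1", "interface:leaf1:eth2"],
    "device:leaf2": ["interface:leaf2:eth1"],
    "interface:leaf1:eth1": ["vlan:10"],
    "interface:leaf1:eth2": ["vlan:20"],
    "interface:leaf2:eth1": ["vlan:10"],
    "vlan:10": ["customer:acme"],
    "vlan:20": ["customer:globex"],
}


def impacted(start, kind, edges=EDGES):
    """Breadth-first walk from `start`, collecting reachable nodes of `kind`."""
    seen, queue, hits = {start}, deque([start]), []
    while queue:
        node = queue.popleft()
        if node.startswith(kind + ":"):
            hits.append(node)
        for neighbor in edges.get(node, []):
            if neighbor not in seen:   # guard against cycles and repeats
                seen.add(neighbor)
                queue.append(neighbor)
    return sorted(hits)


print(impacted("device:leaf1", "customer"))
print(impacted("device:leaf2", "customer"))
```

The same traversal answers different questions just by changing the starting node and the kind being collected, which is the "do more with the data" point made above.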

Eric Chou:

But why do you even have a schema at that point? That reminds me of the unstructured, schema-less, NoSQL databases where you just put whatever junk you have in there.

Damien Garros:

Yeah, the schema helps with the integrity, the schema helps with the query engine, and it helps you actually know what you have. So I think it's super important. And when I started realizing that what we're trying to solve is also happening in other industries, I thought maybe we should pay more attention and understand them. When I started looking, I realized that all of them are also using graph databases. They all came to the same conclusion: to solve this specific problem, you need a graph database. That was probably one of the confirmations. I already had this idea that we should move to a graph database, but when I looked and saw that they're doing very much what we're doing, and they also have a graph database, that was a confirmation that that's where we need to go. I really wish we would pay more attention sometimes to what's happening in other industries. I think there's a lot to learn from what they're doing, because our problems are specific to us, but in many ways they're not that unique.

Eric Chou:

I so want to double click on that, but this is probably a good point to wrap up. If this is interesting to you, make sure you stay tuned for the next episode, where we'll dive deeper into it. So thank you for being on part one of the recording, Damien, and we'll resume our conversation in the next session. Is that cool?

Damien Garros:

Yeah, thank you very much.

Eric Chou:

All right, cool. Thanks for listening to Network Automation Nerds Podcast today. Find us on Apple Podcasts, Google Podcasts, Spotify and all the other podcast platforms. Until next time, bye-bye.

Exploring Infrastructure Automation and Database Solutions
Graph Databases in Network Infrastructure
Embracing Graph Databases for Data Relationships