
Rethinking the Network Engineer – by Victor Kuarsingh
As a passionate network technologist and leader, I continuously think about what it takes to build networks for today and for the future. I started my career at a time when networks were either enterprise focused (hidden and secured from this new thing called the Internet) or provider networks directly attached or connected to the Internet. If Internet access was required, enterprise networks would be fitted with firewalls that provided flow-based security and network address translation (NAT) capabilities.
Although both types of networks may have been connected to the Internet, enterprise networks and provider networks were often built very differently. Historically, the Internet primarily supported early content sharing, email, news, and basic business functions as it evolved towards the modern Internet. Today, at least in western culture, the Internet as we now know it is a foundational part of how people work, shop, socialize, research, learn, and generally interact.
In this post, I dive into the evolution of capabilities needed to be a successful Network Engineer. I concentrate more on the provider network side of the equation; however, engineers from the enterprise side can benefit from this discussion.
In the past
Over the years, “the network engineer” was a vital part of how the Internet came to be, when thinking of the practical deployment of early networks. Many people contributed to facilitating the growth of the Internet; however, Network Engineers working in provider networks, transit provider networks, content provider networks, hosting providers, and Internet Exchange Points (IXPs) were instrumental in building the infrastructure that allowed packets to flow.
In the early days, I knew of many network engineers who “just fell into it”, picking things up and learning as they went along. Some early Network Engineers may not have had a deep background in computing, Internet protocols, communications, or programming, but were able to figure out how things worked and supplied tremendous value to building our early networks. Determination and a desire to learn were a major advantage to those early entrants into the Network Engineering discipline.
We also witnessed the rise of vendor-sponsored learning and credential assignment, as the industry sought to promote and financially reward people who were inclined to study and learn pre-defined skills for network design and operations. The vendors driving these programs helped many to join the ever-expanding network engineering and network management workforce.
Networks in the early days were often much simpler than they are today, and it was quite possible to have a person (or persons) who knew the details of how an entire network system worked. These persons could help build, provision and troubleshoot entire networks using early user-driven methods and the infamous CLI. However, life has changed, and so have the demands on modern and future Network Engineers.
The stacking of network knowledge
I have been known by those around me to refer to a hierarchy of networking knowledge. I call it the ‘what’, ‘how’ and ‘why’. It seems quite simple, but there is more than just nuance to what these thresholds of capability are, and to how they drive what I think the demands for network engineering are becoming.
I call the ‘what’ of networking the skills to basically build syntax and get networks up and running, along with fixing them when broken. The ‘what’ of the trade is often covered by a vast majority of engineers and operators who know the basics and can operate well on a command line (syntax gurus). They are familiar with networking protocols and the basics of how protocols interact with the network, can wrangle the console, and have a firm command of networking environment syntax. Many network engineers, and most network operators I have known in my past, fall into this first threshold. Let’s call this level one capabilities. If we were talking about something like OSPF, the ‘what’ person would know that it’s an IGP (Interior Gateway Protocol), know that there are areas, how to configure them, how the areas work, and the various route types and functions of OSPF. This is all you need for level one: the ‘what’ person.
I call the ‘how’ the next level of skills around networking. It goes beyond knowing how to configure things and is the skill and knowledge of how things really work at their more fundamental layer. Staying with our OSPF example, the ‘how’ person will know how OSPF works underneath. The ‘how’ person understands the link state database, how Dijkstra works, shortest path trees, and often how vendors have implemented the protocol into their platforms. Beyond this, the ‘how’ person will be able to drive network designs and understand how to best utilize and/or choose options within given functional models. This type of person, with this ‘how’ knowledge, comprises most of the remaining network engineer and network operations personnel. A person with ‘how’ knowledge can design networks given enough scope and guidance, and is often a master troubleshooter. Within the emerging networks, I think this level two state of knowledge is the minimum level for the future network engineer (more on that later).
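To make the ‘how’ of the OSPF example a little more concrete, here is a minimal Python sketch of Dijkstra’s algorithm building a shortest path tree from a simplified link state database. The router names, link costs and the `shortest_path_tree` function are my own illustrative assumptions; a real OSPF implementation deals with LSA types, areas and equal-cost paths that are left out here.

```python
import heapq

def shortest_path_tree(lsdb, root):
    """Return {router: (total_cost, parent)} for every router reachable from root.

    lsdb is a simplified, hypothetical link state database:
    {router: {neighbor: link_cost}} with positive link costs.
    """
    cost_to = {root: 0}
    parent = {root: None}
    done = set()
    heap = [(0, root)]  # candidate list ordered by total cost
    while heap:
        cost, router = heapq.heappop(heap)
        if router in done:
            continue  # stale entry; a cheaper path was already finalized
        done.add(router)
        for neighbor, link_cost in lsdb.get(router, {}).items():
            new_cost = cost + link_cost
            if neighbor not in cost_to or new_cost < cost_to[neighbor]:
                cost_to[neighbor] = new_cost
                parent[neighbor] = router
                heapq.heappush(heap, (new_cost, neighbor))
    return {router: (cost_to[router], parent[router]) for router in cost_to}

# Illustrative four-router topology (names and costs are made up).
lsdb = {
    "R1": {"R2": 10, "R3": 5},
    "R2": {"R1": 10, "R3": 3, "R4": 1},
    "R3": {"R1": 5, "R2": 3, "R4": 9},
    "R4": {"R2": 1, "R3": 9},
}
spt = shortest_path_tree(lsdb, "R1")
for router, (cost, via) in sorted(spt.items()):
    print(router, cost, via)
```

In this toy topology, R1 reaches R2 through R3 rather than over the direct link, because the two-hop path costs less. Knowing why that happens, and not just which command shows it, is the kind of reasoning I expect from the ‘how’ person.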
The ‘why’ person is far more advanced than the level one (what) and level two (how) person. This individual understands not just how something works in detail, but also why things are the way they are. These individuals can often contribute in much deeper ways, including working on future standards, modifying protocols, building new ones, and understanding how to solve business issues with technology, among many other deeper skills. A ‘why’ person will also be able to look beyond today’s immediate demands, take into account future potential demands, and assess intersectionality with other areas such as applications and services that operate in cooperation with the network. At this level, the person is someone who has solved, and will solve, more complex issues and can help with any paradigm shifts in how we build and operate networks. These ‘why’ persons are the elite and will be needed as we continue to move towards the future.
If one were only a ‘how’ person, they may lack the insight to really advance environments to places they need to go for the future. If one were only a ‘what’ person, many of the skills needed to design modern and future networks are (or soon will be) beyond their capabilities. For those in the ‘what’ roles today, we are seeing automation, monitoring, tooling and computational analysis replace functions managed by the ‘what’ persons.
Modern and emerging networks
A great deal has changed since those early days of the Internet and the networks that were strung together to make it viable. We have seen an abundance of technologies introduced to help improve network function, visualize paths, stack services and build robustness into the system. Beyond that, the complexity of the networks that power the Internet has increased in many places, notably in the provider environment where previously independent services are now collapsed onto common networks. Networks got bigger and more complex, and we began to depend on them to a much greater extent and for a wider set of both business and personal needs. Being offline 20+ years ago was a nuisance; being offline in today’s world can cripple a business and render many workers immediately unproductive.
Today, some networks are incredibly big, so big that if you attempted to build (configure) them by hand, it would take months or even years to complete by applying traditional techniques. Other networks are highly complex as they stack many logical networks onto a single physical network with varying logical topologies, forwarding rules, service requirements and expectations.
In Service Provider networks one may see Internet services, access services, Layer-2 services, private Layer-3 services, video services, voice services and more, all living on a common backbone that needs to scale and function at a level of quality not demanded 15-20 years ago. All of this happens while trying to build, expand and repair these networks at a faster rate. Cloud networks are no easier and have a host of other unique challenges given their dense scale. Here we see massively scaled data centers that need to accommodate a significant number of hosted customers with a wide array of demands and workloads, and a continuous need to drive more functionality into the service stack. Not only is it hard to design and deploy in these networks, but operational demands are much higher.
It may be true that some of the best network engineers have historically had the capacity to ‘see’ an entire complex and/or massively scaled network system in an abstract way; however, it has become increasingly hard to effectively troubleshoot and/or analyze such networks by hand (e.g. via the CLI), even for the top/elite engineers. To keep up with the demands of new builds, to ensure systems are fixed within a reasonable amount of time, and to verify overall service health, network automation, telemetry and tooled systems analysis are needed.
Modern and future needs
What does it take to build and manage a modern network? First off, it takes a different way of thinking (remember the ‘how’ and ‘why’ persons). When you build a network through legacy techniques, you build and configure it in a way that is ‘CLI centric’, as I call it. The implementor or administrator types commands into a console window, and with a bit of extra skill, can script the manual lines of syntax for faster execution. You may also have a few drawings of what you are building, and it takes a long time to get the system configured since many of the steps are manual. Legacy configuration is often structured in a way that makes it easier to apply the configuration and to later troubleshoot the system through direct human interaction. One may desire, in a legacy world, a minimalist configuration to help the administrator when viewing it later. This style of legacy optimization allows one to follow the more complex parts of the syntactical configuration when looking at it (e.g. routing policy, filtering rules and so forth).
Unfortunately, this manual style of interaction will eventually lose its viability entirely. If you have 1000s of devices to configure in less time than you previously had to configure, say, 10s of devices, logging in to apply configuration no longer makes technical or business sense. Beyond the deployment time consideration, in operational use cases one may need to apply changes widely to 100s or 1000s of devices in a very short period of time. This need demonstrates that it’s not advisable, or even realistic, to expect success via the CLI in any reasonable timeframe. Above that, with so many data points to analyze during complex outages, it may not be humanly possible to assess all the data in a timeframe that meets the expectations of the business and/or customer. So what do we need? We need automation, telemetry, tooling and offline analysis.
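As a rough sketch of what applying one change to 1000s of devices can look like in software, the Python below fans a change out concurrently and collects successes and failures. Everything here is an assumption for illustration: `apply_change` is a hypothetical stand-in for whatever device API or configuration push a real environment uses, and the device names and the change string are invented.

```python
from concurrent.futures import ThreadPoolExecutor

def apply_change(device, change):
    """Hypothetical stand-in for a real device API call or configuration push."""
    if device.endswith("-down"):
        # Simulate an unreachable device so the failure path is exercised.
        raise ConnectionError(f"{device} unreachable")
    return f"{device}: applied {change!r}"

def roll_out(devices, change, max_workers=50):
    """Apply one change to many devices concurrently; return (succeeded, failed)."""
    succeeded, failed = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {device: pool.submit(apply_change, device, change)
                   for device in devices}
        for device, future in futures.items():
            try:
                succeeded[device] = future.result()
            except Exception as exc:
                # Record the failure instead of aborting the whole rollout.
                failed[device] = str(exc)
    return succeeded, failed

# Illustrative inventory: 1000 simulated devices plus one that is unreachable.
devices = [f"edge-{n:04d}" for n in range(1000)] + ["edge-lab-down"]
succeeded, failed = roll_out(devices, "ntp server 192.0.2.10")
print(len(succeeded), "succeeded;", len(failed), "failed:", failed)
```

The point is not the specific library. The same change that would take days of logging in by hand becomes one call, with an explicit list of what failed and needs follow-up.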
When troubleshooting, especially in highly complex systems, one needs to verify many aspects of the performance of the system, and many components of the network state, quickly. Attempting to do that using a CLI or console is not effective and, I would argue, downright impossible to do to any level of quality. Tools are needed to export relevant data to be viewed externally, which can often graphically show the state of network conditions and report on the network state parameters. However, even offline verification with these tooled capabilities is beginning to be untenable in some places, and those places are moving towards automated verification, analysis and decision making.
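One simple form of automated verification is comparing intended state against what telemetry reports. The Python below is a minimal sketch of that idea under my own assumptions: the `verify_state` function, the dictionary layout, and the device and interface names are all hypothetical, and real telemetry would arrive through a collector rather than a literal.

```python
def verify_state(intended, observed):
    """Compare intended interface state with observed telemetry.

    Both arguments are hypothetical {device: {interface: state}} mappings.
    Returns a list of (device, interface, problem) findings.
    """
    findings = []
    for device, interfaces in intended.items():
        seen = observed.get(device)
        if seen is None:
            findings.append((device, None, "no telemetry received"))
            continue
        for interface, expected in interfaces.items():
            actual = seen.get(interface, "missing")
            if actual != expected:
                findings.append(
                    (device, interface, f"expected {expected}, observed {actual}")
                )
    return findings

# Illustrative data only.
intended = {
    "core-1": {"eth1": "up", "eth2": "up"},
    "core-2": {"eth1": "up"},
}
observed = {
    "core-1": {"eth1": "up", "eth2": "down"},
}
for finding in verify_state(intended, observed):
    print(finding)
```

A check like this runs in milliseconds across an entire network snapshot, which is exactly the kind of work that is impractical to do by eye on a console.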
What is the future network engineer?
The type of person needed to design, build and troubleshoot such networks is changing. This change will take a few years since many in our industry today need to retool their skill set and others need to be fundamentally re-educated. Out of pure necessity, a rotation of new blood will roll into the industry and many current engineers may choose to find different roles inside their organizations. So what do we need?
Software and logic are on the rise as a foundational element for networks. They have become the primary tool to accomplish many of the needs we have in modern networks. They are needed to build automation systems, deployment tools, analysis tools, verification tools and more. Understanding software, how it’s built, and how to use it for networking purposes is essential. We need to design systems with the intention that software and logic are used to deploy, run and troubleshoot elements. This will be essential for success and competitiveness in the future.
Not all Network Engineers need to be developers, but understanding software structure and capabilities, and ensuring logic can be applied in places where human interaction and decision making once were, is a critical part of the future network engineer’s skill set. Automation, for example, changes the way one even designs a network. The application of automation changes basic design patterns, driving designers to stop anchoring syntax and design structure around visual simplicity. For example, a device configuration should be constructed such that software logic can be used to build, adjust and remove components of the structure. Digging in further, it no longer matters if a complex policy fits into only a couple hundred lines of syntax (so one can read it); what matters is that it is built to support offline analysis that can check its validity and/or adjust it. This is a major shift from years ago. We don’t build it to be read in real time; we build it to be parsed by code, and we apply quality to the equation during templatization and design cycles.
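To illustrate what ‘built to be parsed by code’ can mean, here is a small Python sketch in which a policy lives as structured data that software validates offline, adjusts, and only then renders into text. The policy layout, the function names and the rendered syntax are all invented for this example and do not correspond to any vendor’s configuration language.

```python
import ipaddress

def validate(policy):
    """Offline check of a hypothetical policy structure; returns a list of errors."""
    errors = []
    seen = set()
    for index, term in enumerate(policy["terms"]):
        if term["action"] not in ("permit", "deny"):
            errors.append(f"term {index}: unknown action {term['action']!r}")
        try:
            # strict parsing rejects prefixes with host bits set, e.g. 10.0.0.1/24
            network = ipaddress.ip_network(term["prefix"])
        except ValueError as exc:
            errors.append(f"term {index}: {exc}")
            continue
        if network in seen:
            errors.append(f"term {index}: duplicate prefix {network}")
        seen.add(network)
    return errors

def remove_prefix(policy, prefix):
    """Return a copy of the policy without the given prefix."""
    terms = [t for t in policy["terms"] if t["prefix"] != prefix]
    return {**policy, "terms": terms}

def render(policy):
    """Render the policy into made-up, vendor-neutral configuration lines."""
    return [
        f"policy {policy['name']} term {number * 10} {term['action']} {term['prefix']}"
        for number, term in enumerate(policy["terms"], start=1)
    ]

policy = {
    "name": "CUSTOMER-IN",
    "terms": [
        {"action": "permit", "prefix": "192.0.2.0/24"},
        {"action": "permit", "prefix": "198.51.100.0/24"},
    ],
}
print(validate(policy))
print("\n".join(render(remove_prefix(policy, "198.51.100.0/24"))))
```

Whether the rendered output is ten lines or ten thousand no longer matters much, because the validation and the adjustments happen on the data, not on what a person can read on a screen.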
Unfortunately, many of these skills are not just picked up “on the fly”; they need to be taught through education and re-training, given the right aptitude. Because these skills are different from those needed previously, the aptitude that served network engineers who came into the industry years ago may no longer match what is needed now. Don’t get me wrong, I know many incredibly intelligent and capable network engineers who will do just fine, as they already have what it takes to move forward. That said, there is a large number of network-focused folks who I think will need to retool and re-educate themselves for the future. I am truly concerned that the skills and capabilities which were once lucrative in networking, based on the previous “on-the-job” training, will soon fade into commodity roles and/or be replaced by software and automation.
I am very hesitant to provide specific recommendations for large swaths of people; however, if I had to make note of key skill sets that all Network Engineers should seek to gain (should they not already have them), it would include those in the following paragraphs.
First, software structure and the ability to code. Understanding how software is built and can be used for the purposes of networking is very important. It’s more important for the designer, but I would say it matters for the operator as well. Many suggest that network engineers learn Python, as an example, and I agree. There are many options for high-level programming languages; however, if you need to learn one high-level language, it’s likely the best one right now. I would extend that by suggesting folks take courses (class, offline, etc.) on programming and software structure. At least the fundamentals of this area of expertise will be valuable for all Network Engineers. The side benefit is that those engineers can also better understand how network software (on common vendor platforms) works, depending on how much they study and how deep they go. Far too often, I find engineers who cannot grasp why certain conditions exist in production; many times a lack of understanding of how software functions is the culprit. This lack of understanding also leaves those same engineers unable to understand what it takes to change software, but I digress.
Second, math for the designer roles. I know folks often shy away from math; however, analysis, modeling and building complex systems mean you need to apply mathematical models to a problem. Guessing or eyeballing is not a good way of approaching complex problems where specific solutions and efficiency may be needed.
Third, engineering skills and an understanding of hardware. I would say this area has always been needed; however, it is becoming a serious need and an area for improvement. For years, the use of big metal vendor appliances, firewalls and routers has allowed many network engineers to get away with not really understanding the depth of how hardware and software interact, and how these then impact networks. Many big metal routers hid the intricacies of what it takes to move traffic effectively and dealt with that complexity inside the platform, at a high cost. However, with many modern network designs, and the increased use of lower-cost merchant-based platforms, the need to understand hardware is especially important as we replace big metal with fabrics and more unique network designs. Constrained capabilities on merchant-based platforms mean designers need to really understand the details of how packet processing works at its most fundamental layer. Additionally, those modern designs blow out the topologies and functions, once hidden by high-cost big metal, into discrete elements. Designing in this new world means you need to know hardware and you need to know how functions are implemented in code, if you want to be a successful Network Engineer.
Fourth, drive. I think this will be as important as the first two. In any changing environment, a desire to be successful and to learn what is needed at all costs is an essential attribute. I have seen too many Network Engineers fight change, only to eventually lose the battle and be left in difficult positions. One need not look far into the past to find examples of people who fought technology changes, just to watch things change anyway. All you can do by fighting change is delay it. Necessity and economics will dictate that things must change, so you may as well embrace change now and get on with it. Hiding information from peers, ignorance, manipulating organizations to prevent progress, and insisting that ‘changes to designs/deployments bring uncertainty’ will only end in failure at some point, at which point you are behind all your peers and forced to catch up.
Networks have been getting, and continue to get, more complex. Infrastructure is also essential to how we run as a society (education, business, government and personal lives). The skills needed to design and automate these networks are changing. If one wants to keep working in this industry 10 years from now in a role that is fulfilling, it’s time for network engineers to think deeply about who they want to be and what skills are needed to contribute to our future.