What It’s Like to Work as an Infrastructure Software Engineer (at Flexport)

 3 years ago
source link: https://flexport.engineering/what-its-like-to-work-as-an-infrastructure-software-engineer-at-flexport-252267999fd9

Server rooms: Flexport is hosted in one of these!

For the past 4 months, I’ve been working as a software engineering intern on the Backend Infrastructure team at Flexport. As someone who previously considered themself a full-stack developer, I can say there are a lot of differences, but working on the infrastructure engineering side has been an enjoyable and satisfying experience.

To be honest, I was initially a little hesitant going into my internship, because it wasn’t too clear to me what an infrastructure software engineer did. There wasn’t much information about the role online, and if there was, it was pretty vague or just inaccurate.

Wrong kind of “infrastructure engineer”..

I wrote this post primarily to share my thoughts on what an infrastructure engineer is, and my experience working as one. Hopefully it’s educational, and helps you consider (or not consider) it as a future career option :)

What is an Infrastructure Engineer?

The specific responsibilities of an infrastructure engineer vary depending on the company, but generally, it refers to someone who develops and maintains tools and frameworks that other teams use to smoothly develop their applications. Note that these tools/frameworks are usually domain agnostic. An infrastructure engineer can work on:

  • CI/CD pipelines — A continuous integration and continuous delivery system is an important tool that allows engineers to efficiently work together on a software project. It automates some of the important steps, such as running unit tests, building the image, and deploying the app, so that engineers can focus on writing code. Some popular tools for CI/CD include Travis CI, CircleCI, Buildkite and Jenkins.
  • Cloud Infrastructure — A lot of companies, nowadays, host their infrastructure in the cloud. It’s a lot cheaper and easier to make infrastructure changes, because there’s no need to worry about expensive, physical servers. Popular platforms like AWS, Azure and GCP also have a lot of built-in tools and features that make infrastructure development much simpler.
  • Logs and Metrics — Logs and metrics are critical for monitoring applications and debugging issues. Logs can provide visibility into interesting events that happen within an application. Metrics can be used to create time-series visualizations, helping you see when things go wrong at a glance. Either way, infrastructure must be set up to get this data from production and staging servers into some third-party monitoring service, such as Datadog, Splunk, Scalyr or Sumo Logic.
  • Misc. Internal Tools and Services — These can be any tools that help engineers develop their applications. For example, we have a popular tool at Flexport that engineers use to auto-generate code skeletons for new services.
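
To make the CI/CD item above a bit more concrete, here is a minimal sketch of the core thing a pipeline automates: run a fixed sequence of steps and stop at the first failure, so a broken build never gets deployed. The `run_pipeline` helper and the step names are hypothetical and only for illustration; real systems such as Buildkite or Jenkins define steps in their own configuration formats.

```python
from typing import Callable, List, Tuple

# A step is a (name, action) pair; the action returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_pipeline(steps: List[Step]) -> Tuple[bool, List[str]]:
    """Run steps in order, stopping at the first failure.

    Returns (succeeded, names_of_steps_that_ran).
    """
    ran: List[str] = []
    for name, action in steps:
        ran.append(name)
        if not action():
            # Later steps (for example, deploy) are skipped on failure.
            return False, ran
    return True, ran


# Hypothetical steps: tests pass, the image build fails, deploy never runs.
example_steps: List[Step] = [
    ("unit-tests", lambda: True),
    ("build-image", lambda: False),
    ("deploy", lambda: True),
]
```

Calling `run_pipeline(example_steps)` reports a failure and shows that only the first two steps ran, which is the behavior that lets engineers trust the deploy step.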

As mentioned, however, the role of an infra engineer depends on the company, especially its size. Infrastructure at a large company like Netflix (~2000 engineers) is very different from Flexport (~200 engineers), or a small startup (~10 engineers). There’s a lot more emphasis on handling high volumes of traffic with low latency at Netflix (as you can imagine) than there is at Flexport.

My Internship Experience

First Week

The first week of my internship was very different from what I had previously experienced, or imagined. I wasn’t doing any programming; I was writing configuration for AWS, through a tool called Terraform (Terraform is awesome by the way, and I’m really glad I learned it). I was also surprised by how much AWS documentation I needed to read through, which definitely makes sense, given most of Flexport’s infrastructure is built on top of it. The work felt very different from what I was used to as a full-stack dev: connecting API endpoints, making UI changes in the front-end or writing application features. In a sense, I felt like I was re-learning software engineering again, with the process being very fun and exciting.

Projects Built and Technologies Used

The first part of my internship was slower as I focused on learning, but as I ramped up, I also started contributing more. I took on bigger projects and greater responsibilities. Here are some of the projects that I worked on:

  • Adding AWS Session Manager to services — Many teams wanted SSH access to their service’s containers for easier debugging, but it wasn’t a secure practice to just hand out SSH keys. AWS Session Manager is a tool that allows engineers to open an SSH-like session into a container. My first project was actually to set this tool up on all of our services.
  • Creating a Jobs Queue API — Our services previously didn’t have support for submitting and handling asynchronous/non-blocking jobs. Rather than setting up a background jobs framework for every service (with extra maintenance costs), we built a centralized system. Whenever services need to create an async job, they can just make an API call to Jobs Queue, a dedicated service that does background job processing.
  • Building a Slack alert tool for failures in our CI/CD — We use Buildkite at Flexport for our CI/CD system. Rather than reminding ourselves to look at Buildkite’s web UI for any build failures, we built a Slack tool that will automatically notify the correct channels and people, whenever a build or deploy pipeline breaks.
  • Adding code coverage metrics to services — Code coverage is an important measurement of how well-tested a piece of software is. Generally, a high code coverage metric means the software is less bug-prone. We generated code coverage data for each service using third-party libraries, as a step within our CI/CD pipelines, and uploaded those metrics to display charts.
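
The Jobs Queue idea from the list above can be sketched roughly as follows. This is a minimal in-memory illustration under my own assumptions, not Flexport’s implementation: the `JobsQueue` class and its `register_handler`, `submit` and `work` methods are hypothetical names, and in a real system `submit` would be an API call over the network with workers running in a separate process.

```python
from collections import deque
from typing import Any, Callable, Deque, Dict, List, Tuple


class JobsQueue:
    """In-memory stand-in for a centralized background-jobs service."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[Any], Any]] = {}
        self._pending: Deque[Tuple[int, str, Any]] = deque()
        self._next_id = 1
        self.results: Dict[int, Any] = {}

    def register_handler(self, job_type: str, handler: Callable[[Any], Any]) -> None:
        """Tell the queue how to process jobs of a given type."""
        self._handlers[job_type] = handler

    def submit(self, job_type: str, payload: Any) -> int:
        """Enqueue a job and return its id right away, without running it."""
        if job_type not in self._handlers:
            raise ValueError(f"no handler registered for {job_type!r}")
        job_id = self._next_id
        self._next_id += 1
        self._pending.append((job_id, job_type, payload))
        return job_id

    def work(self) -> List[int]:
        """Process all pending jobs in FIFO order; return the ids processed."""
        done: List[int] = []
        while self._pending:
            job_id, job_type, payload = self._pending.popleft()
            self.results[job_id] = self._handlers[job_type](payload)
            done.append(job_id)
        return done
```

The point of the design is that callers only ever see `submit`, which returns immediately; the processing happens later in `work`, so individual services don’t need their own background jobs framework.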

You may notice that most of the above projects were centered around services. It’s a really exciting time at Flexport to be working on infrastructure, because we’re actively moving towards a service-oriented architecture (SOA) and there are so many projects around that.

Technologies

Here are some of the technologies that I worked with throughout my internship:

AWS (ECS, ECR, EC2, CloudFormation, CloudWatch, Lambda, SSM, etc.), Terraform, Docker, Datadog, Scalyr, Periscope, Bash scripts, Buildkite, Ruby on Rails, Java, Bazel, PostgreSQL, Google Docs.

As you can see, there’s a LOT of AWS. We heavily rely on AWS at Flexport, just like most other companies our size. If you were paying attention, you may have noticed that I also included Google Docs, because I had to read a lot of system design docs (written in Google Docs) to better understand Flexport’s infrastructure.

Key Differences Between Infra vs. Full Stack Engineering

Having historically been a full-stack developer before switching to infra, I want to talk about some of the differences that I’ve noticed.

  • I write less code. Instead, a significant part of my time is spent debugging issues, monitoring logs and reading/writing design docs. On the plus side, I feel like my system design skills have improved so much since I started.
  • There’s little to no front-end development and UI/UX work, if you’re interested in that. After all, you’re not developing for the end user. Additionally, any logs and metric charts can be displayed with third-party tools, such as Datadog or Periscope.
  • My customers are different. Rather than being end users, my customers are other teams that rely on the tools I build. Because of this, my work isn’t driven by any product manager. There’s less of an opportunity for me to learn about the product side, but I also have more flexibility to design and build what I want.
  • I care more about developer productivity/experience. When considering new projects and tools, I think about: “How will this make our engineers better or more efficient?”
  • The kind of impact I make on the company is very different. One of the reasons I enjoy software engineering is that I can make a visible impact with the code I write. If I were on a product team, it would be very easy to just point and say, “hey I built this feature”, but on the infra side, it’s not so clear. On the plus side, it’s really cool to know that the infra changes I make will indirectly affect many, if not all, projects out there.
  • There tend to be more blockers, such as from security and other engineers. As an infra engineer, I work a lot more with the information security team, because there’s a lot more risk of introducing a vulnerability through an infrastructure change. Not to mention, a lot of security is built on top of infrastructure. Engineers from other teams can push back on my projects too, especially when I introduce a new tool that changes their developer workflow.
  • Occasionally, there are fewer blockers. For example, I have more infra permissions than engineers from other teams, so it’s easier for me to investigate and debug issues in the AWS Console. That said, more permissions also means exercising more caution.

Takeaway

I really enjoyed my time as an infra engineer at Flexport. If I could go through the team matching process again, I would still choose the backend infrastructure team 100%. This doesn’t mean that I’ve given up full-stack development, though; I still love full-stack and developing in JavaScript. Now I can do both! Infrastructure used to seem so “magical” to me, but through my experience, it’s something I can better understand and appreciate more. I’d definitely recommend that devs and students reading this try out infrastructure software engineering (and Flexport’s backend infrastructure team). There are a lot of interesting challenges to work on, and you might just really like it :)

