On Rails

Hilary Stohs-Krause: Scaling Rails with Small Wins

• Rails Foundation, Robby Russell • Season 1 • Episode 5

In this episode of On Rails, Robby is joined by Hilary Stohs-Krause, a Senior Software Engineer at Red Canary. They explore how engineering teams approach everyday performance work, from small Active Record optimizations to larger architectural decisions. Hilary shares insights from Red Canary's journey switching from React to Rails' native Hotwire stack, how her team tackled flaky test failures that were slowing down continuous deployments, and some strong opinions about custom linters.


🧰 Tools, Libraries, and Books Mentioned

  • RuboCop – Enforces Ruby style and conventions, with support for custom cops.
  • haml-lint – Linter for HAML templates to enforce consistent view code.
  • ESLint – JavaScript linter used for maintaining consistent code quality, especially in React.
  • Hadolint – Linter for Dockerfiles to catch common issues and enforce best practices.
  • SitePrism – Page-object model DSL for Capybara to reduce flaky system tests.
  • Sidekiq – Background job processor used in production Rails environments.
  • Turbo – Part of the Hotwire stack for reactive updates without full-page reloads.
  • Stimulus – Lightweight JavaScript framework for enhancing HTML with small interactions.
  • Hotwire – A set of tools (Turbo + Stimulus) for building modern web apps without heavy JavaScript.
  • Thinking in Bets – A book about better decision-making under uncertainty, by Annie Duke.


On Rails is a podcast focused on real-world technical decision-making, exploring how teams are scaling, architecting, and solving complex challenges with Rails.

On Rails is brought to you by The Rails Foundation, and hosted by Robby Russell of Planet Argon, a consultancy that helps teams modernize their Ruby on Rails applications.

[00:00:00.000] - Robby Russell: Welcome to On Rails, the podcast where we dig into the technical decisions behind building and maintaining production Ruby on Rails apps.

[00:00:11.440] - Robby Russell: I'm your host, Robby Russell. In this episode, I'm joined by Hilary Stohs-Krause, a Senior Software Engineer at Red Canary. We talk about how her team approaches everyday performance work, from small Active Record optimizations to much bigger architectural decisions. Hilary shares examples like switching from React to Turbo and Stimulus, and how her team reduced flaky test failures that were slowing down and breaking their continuous deployments. We also explore how engineering culture and tooling shape these kinds of efforts, especially in large Rails apps where not everyone is focused on performance all of the time. Hilary joins us from Madison, Wisconsin, in the US. All right, check for your belongings. All aboard. Hilary Stohs-Krause, welcome to On Rails.

[00:00:52.580] - Hilary Stohs-Krause: Thanks for having me.

[00:00:54.320] - Robby Russell: Now, I always like to start with the question that gives a little grounding. What keeps you On Rails?

[00:01:00.440] - Hilary Stohs-Krause: A lot of things. So this actually came up a couple of years ago when I was on the job hunt. I have a lot of friends in tech who were encouraging me, like, Oh, I know you're just looking for jobs in Ruby and Rails, but you've been programming long enough. You could learn Python, you could learn PHP, you could learn whatever, Java. And I stubbornly really just didn't want to. I was like, I've only worked in Rails. My boot camp was in Rails, and I want to stay in this language, but I think more importantly in the community. I've been lucky to have spoken at a lot of Ruby and Rails conferences, and especially as someone who was a career changer into programming, and a woman, and a queer woman, I really found it to be such an uplifting, welcoming, and empowering community right from the get-go. That's something that you just can't put a value on. It makes such a difference. I love the emphasis in the community on growing people within Ruby and Rails. There's a lot of support and programs for more junior developers. I've been a guide with the Scholars and Guides program a lot of times, and I just love how there really is this focus in Ruby on Rails on not just making the framework and the language better, but also making the people better and making us better developers in more ways than just how many methods do we know and how quickly can we produce an MVP.

[00:02:41.260] - Hilary Stohs-Krause: I really value that. I think it's a large part of why I enjoy my job so much.

[00:02:46.200] - Robby Russell: That's wonderful. You talked a little bit there about how the community is encouraging to junior developers. I recently gave a talk at RailsConf. It was a couple of weeks ago, and while I was preparing for my talk, I was like, what was it about Rails that resonated for me early on? Because there wasn't much community infrastructure when I first started. It's always interesting to hear how people, based on when they came into the community or got wind of the community, have these different perspectives. I was like, there was a community there when I first started, but it was IRC channels. Maybe a local meetup in the town that I live in, where I could go meet these other people, and none of them were working professionally with Ruby or Rails at the time. It was just their hobby thing. I think it's interesting, having seen this transpire over a couple of decades now, having all this infrastructure, like when boot camps started spinning up and Rails would be the thing that they would teach. Ruby itself was a really good programming language to introduce people to the concepts of programming because it was so easy to understand, really.

[00:03:46.020] - Robby Russell: At a high level, you can get your feet wet pretty quickly and then get more into Rails. I think that's interesting. What I'm curious about with what you're talking about there is we're also now in an interesting environment where I don't think junior developers are getting hired as often right now. Maybe this is something we could be talking about later in the conversation, but since it popped up, do you think that there's more that teams and companies could be doing to get over that hurdle again and bring more juniors more proactively into their organizations?

[00:04:16.260] - Hilary Stohs-Krause: A hundred percent. I have a lot of opinions about this. I want to real quick touch on, you mentioned when you first got into Ruby and Rails, there was a local meetup, but it was small. I think that's something that I do want to highlight: I live in Madison, Wisconsin. At one point, we had three different Ruby consulting firms in just Madison. There was a pretty robust meetup. We used to host Madison Ruby. We had that again last summer. There was a lot of community where I was locally, in addition to the broader national and international community. I think that definitely impacted the way that I first interacted with Ruby and Rails folks. Also, I'm one of those odd birds who really enjoys public speaking. I know that's not the case for everyone. I'm also extroverted. There definitely are factors that gave me a leg up when it came to weaving into the community, for sure. When I did try new things or try to meet new people, I feel like everyone was super welcoming and encouraging. I do want to call that out. I'm sure there are people who got into Ruby or Rails around the same time as I did, who lived in smaller towns or just didn't have that same level of access and probably had a different experience.

[00:05:27.700] - Hilary Stohs-Krause: One thing that I do really like about Red Canary, where I work, and I like a lot of things, is that we're, I think, 11 years old now. Are we still a startup? Debatable, right? It depends how you define startup. We're also in cybersecurity. So most of those identities don't often lend themselves necessarily to being deliberate about hiring more inexperienced folks, right? Yeah. There's often a big push for like, Okay, well, we want to get everything done as quickly as possible. We're tackling really challenging problems. We just need the best of the best, and we'll pay them more, and that's what we're going to do. Whereas I've definitely been a part of conversations over the last few years, and I do interviewing and things like that at the company. There's been acknowledgments of like, Hey, we're getting pretty senior heavy. We're getting pretty staff heavy. We should try to figure this out. We also started bringing in, just this past spring, some of our first truly junior, first-professional-job hires. Before, we had juniors where I think we were looking for folks with 2-3 years of experience, and some of that was because we did not have the infrastructure as an engineering organization in place to really support people with less experience than that.

[00:06:47.720] - Hilary Stohs-Krause: I think that's something that's key when we talk about bringing on juniors. It's not as simple as, Oh, we're just going to hire someone without a lot of experience, because if you're not being deliberate about creating a space for them to thrive, everyone's going to have a terrible time. There was a lot of deliberate work put into, Okay, do we have the structure in place that we can bring on someone who will need more guidance, who just will be less effective at the beginning? Can we create space for them to be successful, both in terms of their personal growth and also contributing to the company? I think she was the first one we hired through working with a boot camp directly. This is her first professional programming job, and she's crushing it, and she's awesome, and it's great. I think especially when we go into periods of downturn, like we're seeing now in the tech industry, I talk to people all the time who've been on the job search for months, approaching a year. It's a real struggle there for a lot of people looking for work, but especially juniors. I just always think, when I talk to another junior who's like, Oh, yeah, I've applied to 200 places, or I've gone through seven interviews, I never get the job, or no one's even hiring.

[00:07:56.360] - Hilary Stohs-Krause: It's like, Where do we think seniors and staff come from? It's so short-sighted and it blows my mind. I used to be a partner at a consulting firm, and I loved hiring juniors for a number of reasons. One, they don't come in with all these preconceived notions of how things should be done. Sometimes that's a benefit. You get someone with a fresh perspective like, Oh, my last company, I saw this. Here's what we... That's also very valuable. But I love that they can come in and you're like, Okay, great. Here's how we do software. This is how we want you to do things. We really care about tests, or We always leave comments, or Nothing without documentation, whatever the case might be. They're so eager to learn. I feel like you can give them any story, and generally speaking, they will try. They also, in my experience at least, tend to stick around. It is harder to find a job. When you bring them on and you treat them well and you create opportunities for them to grow, they want to stay. My first job out of my boot camp was an internship that turned into a full-time position, and I was there eight years.

[00:09:06.260] - Hilary Stohs-Krause: I think there's a lot of benefits to bringing in less experienced, early-career folks, apart from just the fact that someone has to train them. I think they really do bring a lot to the team and really help with morale. It frustrates me and it saddens me that more companies don't see the value in growing the industry, really.

[00:09:28.600] - Robby Russell: Something I'm curious about: is it safe to assume that when you started from boot camp, had your internship, and started those eight years at that first company, that was in person?

[00:09:39.080] - Hilary Stohs-Krause: Yeah, it started in person, and then we were hybrid. I think where you're going with this is that there are different challenges to bringing on more junior folks when it's fully remote.

[00:09:50.340] - Robby Russell: Or maybe not. Maybe not. Maybe there's just different sorts of challenges. That was a big shift for my company. For the longest time, we put off bringing on interns. But when we finally started doing it, it was like this light went off. I'm like, Oh, I think it was always less about what can they contribute. It was more like, Will we fail them? It was like, Do we have the capacity? We don't want anyone to come in and to see them struggling or floundering and be like, Oh, we can't really help you. We're too busy elsewhere on client projects or what have you. And so there was this interesting thing where we were like, Okay, let's try this. We're going to try this on one person, see how this goes. We ended up hiring them. They stuck around for several years. They went on to become a senior developer at our company, one of my favorite hires of all time. And that's probably because I'm proud that we were able to make that work for everybody. But the other part is that once we started doing that a little bit, we're like, Oh, this is a muscle that we need to regularly exercise, and we don't always have the capacity to bring on new junior developers.

[00:10:51.570] - Robby Russell: The internship model worked really well for us because when boot camps started popping up and there needed to be a place for their students to go for a couple of months after they finished their curriculum, we're like, Great, send us two or three people, and we'll do this maybe three to four times a year. We'll have a couple of interns come in, and we'll exercise those muscles, and we're going to tell them we're not hiring right now, so it's a very clear expectation, but we want to do everything we can to help you get hired and set you up for success. But in order for us to keep working this muscle, we're just going to bring interns in for that fixed period of time and be very upfront that we're not going to hire you. That just changed the conversation quite a bit. That was our way of getting more comfortable with the idea that occasionally we could hire junior developers. But having interns come in got the rest of our developers into it, including those who might have historically said, I don't have time for that right now. When they were regularly having to do that, be a mentor, and answer questions, they realized that this is a good opportunity to improve your documentation.

[00:11:48.300] - Robby Russell: It's like, Oh, we didn't think of it that way, or let's revisit these things. That was just helping us across the board in a lot of other ways. It also had this weird side effect that we didn't expect: how much confidence our more experienced developers gained just because they were providing mentorship on a regular basis. If you get out of that mindset, and we don't do it as much as we used to now that we're remote, because I'm like, Well, we don't really know how to do this as successfully as we did it when we were in person, because you could see the person on their computer floundering a little bit. Like, Hey, can I help you? Because we realized that junior developers or interns would see everybody else being really busy. For them to ask for help meant that when they finally got to that point to do it, they had already done a lot to try to figure out things by themselves, but they were always trying to be protective of what they perceived as someone else being more busy on something more important. We were like, How do we remove that barrier?

[00:12:38.550] - Robby Russell: In person, we were able to do that pretty well. Then when we got remote, it was like, Okay, we're going to have to have a lot of pre-scheduled pairing time. There's a lot of different strategies we were able to implement to mitigate some of that. But we don't know when someone's sitting there banging their head against the desk, thinking, I don't know, and I don't want to ask because I don't want to admit that I don't know what I don't know yet, until they get some confidence doing that. Do you feel like your team has some secret sauce to that or something you're figuring out?

[00:13:06.340] - Hilary Stohs-Krause: I think some of it is culture. I do think, to your point, my previous company, after I was hired, went too far the other direction and had too many juniors. There was a project that another junior and I were working on, and we were just getting into that cycle of like, Well, I don't know, this seems good to me. Okay, well, let's keep doing it. Then a senior finally checked in on us and was like, Whoa, whoa, whoa, whoa, whoa, feedback. There definitely is a balance there. I think a lot of it is maybe saying the quiet parts out loud, I guess. We just had two new people join my team, not what I would call juniors, but still new to the company. I hit them up because even if you have more experience, joining a company, especially remotely, you can have some of those same feelings. I was like, Hey, I'm starting a new ticket. It's not super complicated, but do you want to work on it with me? You can see our process from beginning to end. They were like, Yes, great. Let's do it. I think a lot of it is demonstrating psychological safety through your culture.

[00:14:08.300] - Hilary Stohs-Krause: One of the things that really drew me to Red Canary's culture is people are very comfortable saying, I don't know, even, like, architects. You're like, Oh, do you know? I saw you wrote this a couple of years ago. I'm not sure what we're doing here. Can you explain? They're like, I have no idea what I did there. It's probably terrible. Let's redo the whole thing. Also asking for help. We have a lot of different channels set up where some of them are explicitly for asking for help. We have a rubber duck channel. Some of them are team or department channels, where it's very common to ask for help or do a, Hey, I learned this new thing for the first time. Here you go. That was something we really tried to do at my old company, too, was just telling people, You're not going to know a lot, and we know that, and we're okay with that. That's why you're here, to learn. Something that I found also really helpful for true juniors was we wrote up a piece of documentation that just said, When do I ask for help? It was like, Okay, have you looked for similar code in the existing code base you're working on?

[00:15:09.000] - Hilary Stohs-Krause: Have you googled it? Because sometimes you get in your head and you forget to just do a Google search. Is this something that you need to get out immediately because it's broken on production or a client is expecting it, or whatever the case might be? How urgent is it? Then we just had scripts for, How do I ask for help? Say what you've tried already. Say what maybe you're thinking of trying. Give a sense of when you'd ideally like feedback by. I'm really big on written and accessible expectation setting across all levels for a lot of different things, but I think especially for juniors, that's really helpful. I would tell people at onboarding, I always say, Look, I will never be mad if you ask me for help. I will be mad if you spend three days trying to do something and don't ask me for help. Yes, I think a lot of it is culture. I think a lot of it is written policy and process, not just unspoken policy and process. Then I think sometimes it depends on the personality of the person.

[00:16:12.060] - Hilary Stohs-Krause: Some people are going to be more comfortable doing that and reach out right away. Some people will need, like you said, maybe scheduled check-ins or scheduled pairing sessions. Then the last thing I would say that I find really helpful with earlier career folks is assigning them, like, a buddy. Someone where it's like, Okay, they're not your manager, but you're going to have a one-on-one, maybe twice a week at the beginning, and then you can move to once a week. They're the person who's going to maybe check in with you in the morning. Do you have everything you need? Are you stuck on anything? Just creating that go-to person that can be their gateway to the rest of the team.


[00:16:48.100] - Robby Russell: I sometimes wonder about the buddy concept. As the owner of my company, I'm like, When do I get a buddy? I get to be plenty of other people's buddy, but sometimes I show up, I'm air quoting, to work, and I'm like, Where's my buddy to show me around? How does your organization work now for someone coming in? It's not the same as it is for someone coming in, but I'm always like, What is that experience like? But also the other part about having access to new people is that you get to learn a lot about your team that you wouldn't have assumed until you hear them talk about it and you're like, Oh, wow, you think there are companies like that? I'm like, I guess there must be. There are all these different realities that coexist. When we're talking about junior developers, you also work with a bunch of staff and senior level developers. One of the reasons I wanted to have you on the podcast was to really dig into a conversation about performance and who's responsible for that. Because at a day-to-day level, there's a lot of small little things, small decisions that developers are making, like the parts of Ruby or Rails that they're using, that can have a cascading impact on, say, the performance of the application.

[00:17:55.740] - Robby Russell: In what areas of Rails do you think things like this tend to happen a little bit more often?

[00:18:00.800] - Hilary Stohs-Krause: In my experience, when we think about performance, we're typically thinking about very big picture problems. We have too many Sidekiq retries, or the database is getting full, or we have too many N+1 queries. Or even our indexes are not built properly or we don't have enough indexes. And those are all true. Those are all important things that we need to pay attention to, definitely. I think we miss the... Almost like we miss the trees for the forest in that way, in which we're looking at just the drone high-level perspective, and we're missing that there are a bunch of individual plants alongside all of the trees, to try and continue with this perhaps ill-advised metaphor. Yes, those bigger picture items are going to have the biggest impact, but there are a lot of small things that we can do that take way less time and effort and collectively can have a decent impact on your performance. I think a lot of the things we look at are system-wide performance, but in my experience, there are a lot of small things we can do that have outsized impact on specific pages or specific queries. Again, as that all builds together, you end up seeing system-level impact.

[00:19:27.700] - Hilary Stohs-Krause: We just tend to go a lot of times directly to those big issues. I think one of the side effects of that is that more people at higher levels tend to be the ones more often thinking about performance because they have the right access or they have the database experience or that kind of thing. I think a lot of juniors, even seniors, aren't necessarily thinking about performance as much as they could because it seems like something that other people are handling. Or, our site reliability people, that's their job. This little baby feature that I'm writing doesn't have anything to do with performance. I would argue that's a missed opportunity.

[00:20:10.560] - Robby Russell: Have you come across any cases where the framework maybe gives you too many ways to do something?

[00:20:18.180] - Hilary Stohs-Krause: I'm laughing because 100%. Yes, yes, yes. The reason I got interested in what I call everyday performance or small wins for performance, I was talking with a friend. We were talking about a conference talk that I'd given, and she said, Well, what conference talk do you want to see? I said, Well, I've done performance work, but it's always once something's broken, right? Generally speaking, it's reactive. I want to know more about what are the individual things that I can do in my daily programming life to be more proactive to help, if not prevent, at least delay some of these issues that we run into. That percolated, and then a few months later, I was like, Well, why don't I just give that talk and figure it out? And a big part of it, yeah, is focusing on, Okay, what are commonplace things that I can do no matter what my experience is as a programmer? A lot of it is being more intentional about which of the myriad Ruby on Rails methods we use to do similar functions. A big one that I think most people are more likely to be familiar with is size versus length versus count.

[00:21:30.000] - Hilary Stohs-Krause: It seems like, Oh, these do the same thing. If I call all of them on an array or an Active Record relation, I'm going to get the same number. But I think that's where we sometimes have a tendency to equate outcome with process. The outcome is going to be the same for all of those. This is true for certain other ones. If you have a big chunk of text and you're trying to replace all of the, I don't know, asterisks with hyphens, you could use tr, you could use gsub. It's going to get you the same outcome. But tr is going to be about three times faster than gsub. Same thing with size and length and count. So count is always doing a SELECT COUNT query. Size tries to be smart about what method it's using. Length will load everything into memory, if it's not already loaded into memory, and then get the count. But I've done a bunch of benchmarking on this and also read about different use cases. And really, depending on what you're doing, you can see massive differences in the amount of time it takes. It does mean putting a little more effort into making sure that you understand what's going on under the hood for some of these commonplace methods.
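To make the count, length, and size distinction concrete, here is a minimal sketch in plain Ruby. ToyRelation is a hypothetical stand-in, not the real Active Record class: it only models the decision each method makes about going back to the database, and it records the SQL it would have sent.

```ruby
# Toy stand-in for an Active Record relation (an assumption for illustration,
# NOT the real class). It models only when each method would query.
class ToyRelation
  attr_reader :queries

  def initialize(rows)
    @rows = rows     # pretend these rows live in the database
    @loaded = nil    # records already pulled into memory, if any
    @queries = []    # log of the SQL we would have sent
  end

  def loaded?
    !@loaded.nil?
  end

  # Pull every row into memory with one SELECT *, at most once.
  def load
    unless loaded?
      @queries << "SELECT *"
      @loaded = @rows.dup
    end
    self
  end

  # count: always asks the database.
  def count
    @queries << "SELECT COUNT(*)"
    @rows.size
  end

  # length: loads all records if needed, then counts them in memory.
  def length
    load
    @loaded.size
  end

  # size: counts in memory when loaded, otherwise falls back to a COUNT query.
  def size
    loaded? ? @loaded.size : count
  end
end

cats = ToyRelation.new(%w[maine_coon siamese sphynx])
cats.size     # not loaded yet, so this issues SELECT COUNT(*)
cats.length   # loads all rows with SELECT *
cats.size     # already loaded, so no new query
cats.count    # always issues SELECT COUNT(*)
p cats.queries
```

All three calls return the same number; the query log is what differs, which is the outcome versus process point made above.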

[00:22:43.040] - Hilary Stohs-Krause: Also, sometimes it doesn't matter. The bigger the amount of data you're working with, the bigger returns you're going to see for choosing one method or the other. Then you also have to think about, I could use maybe more esoteric methods, but is that going to make the code much harder to read? Am I going to gain minimal time back at the expense of readability? Of course, with anything in programming, it's not that simple. But I think there are a lot of methods like that. The great thing about Ruby is we have so many options. I think it has made us perhaps, I don't want to say lazy, maybe too comfortable with just picking the one that gets us the right result instead of thinking about what are we actually doing with these different options that we have.
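The tr versus gsub comparison mentioned earlier is easy to check locally. The following is a rough sketch using Ruby's standard Benchmark module; the sample text and iteration counts are arbitrary assumptions, and the measured ratio will vary with Ruby version, input size, and machine, so treat any single run as indicative only.

```ruby
require "benchmark"

# Arbitrary sample input (an assumption for this sketch): many asterisks to replace.
text = "a*b*c*" * 10_000

# Both calls replace every asterisk with a hyphen and give the same string.
with_tr   = text.tr("*", "-")
with_gsub = text.gsub("*", "-")
puts "same result: #{with_tr == with_gsub}"

# Time repeated runs of each approach on the same input.
Benchmark.bm(6) do |bm|
  bm.report("tr")   { 50.times { text.tr("*", "-") } }
  bm.report("gsub") { 50.times { text.gsub("*", "-") } }
end
```

Changing the multiplier on the sample text is a quick way to see how the gap grows or shrinks with the amount of data, which is the trade-off against readability described above.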

[00:23:32.680] - Robby Russell: Without having the stats or the details right now in front of me, I'm curious, using that present? versus exists? example, what trade-offs are there between those two? Is this something where you can say when it makes sense to use one or the other? Because I feel like in some ways, we're trying to express ourselves with Ruby, and Ruby makes it very easy to do that with, as you're saying, readability. Just like, how am I expressing a business requirement or expressing something in code here. When you think about benchmarking that while you're writing, should people then be having to decide between multiple methods to call here? Which one is faster?

[00:24:10.080] - Hilary Stohs-Krause: I'm hearing like, okay, well, or I could see people listening being like, I'm not going to benchmark every single method call. No, definitely not. That would not be an efficient use of our time. I think some of this can be handled with just your department or your company's code standards. What are your code conventions? We've got a couple of repos, but we're largely a monolith application at Red Canary. For us, generally speaking, these small things are going to add up more quickly, perhaps, than they might if you're working on microservices. Some of it is thinking about, Okay, well, what makes sense for the work that we're doing here? Because that might look different than for someone who's doing it in a different way. Taking the time to figure out, Okay, what does make sense? Do we care about count or size or length? Is the potential impact for these different scenarios worth us taking the time to make a choice? Or can we just blanket say, Hey, we prefer this over this unless you have a good reason to use the other one? Count, length, and size is one where I typically default to size because it does some of that logic under the hood.

[00:25:21.500] - Hilary Stohs-Krause: There are times where you want to use the one that is less performant because it meets a product requirement. We had this come up. We were rebuilding a dashboard, like an index table dashboard, let's say, for cats. And product was like, Okay, I want someone who comes here to be able to just see, at a glance, some high-level data about their cat collection. So maybe we can post little boxes that have the current number of cats for each breed, like their top three breeds. And we're like, Okay, yeah, sure. That makes sense. However, some of our customers have hundreds of thousands of cats. That's going to be very slow if we're trying to get that real-time count every time they load the page. That was one where we worked with product and we determined, Okay, yeah, we can just cache those numbers once or twice a day, put a little note that says this is not necessarily reflected in real-time. That should be sufficient for the way that the customer is going to engage with this data. That's one where, if we already have the cats loaded in memory, we can just call size or length and get that non-real-time count.

[00:26:32.660] - Hilary Stohs-Krause: But there are other times where maybe they push back and they're like, No, we've heard a lot of feedback. Our customers really want to know, to the second, how many Maine Coons they have. In that case, we do want to specifically go back to the database, even though it's more expensive, even though we already have most of our cat collection loaded in memory, because the need for real-time data supersedes the performance aspect in that scenario. I think there's a lot of considerations, and I think ultimately, you can't make a good decision about what to use if you don't know the trade-offs of your decision.

[00:27:08.200] - Robby Russell: You brought up caching, maybe once or twice a day, some data that might be an expensive hit to load in real time. You're like, Well, maybe we don't need to have it be up to the second. In that scenario, does Red Canary take an approach to do that proactively for them? Or does it default to page load and then it gets cached at that point, so the subsequent visits are then leaning on that cache that's not going to expire for 12 hours or whatever? Is that the approach, or do you have a regular thing that goes and fetches that data and caches it, so that when someone does log into their cat interface and sees the dashboard, the data has already been optimized for displaying really quickly when they log in?

[00:27:48.020] - Hilary Stohs-Krause: We've done it both ways, depending on the scenario. It's one of those situations where we have customers of very different sizes with very different... A wide range of data that we're processing for them. We have to build for the biggest customers. I think oftentimes that ends up benefiting smaller customers because everything's very fast. But yeah, so for the example I'm thinking of, the cat dashboard, we did end up doing a database storage cache for those figures and had a job that runs overnight. In other cases, we do a lot of scalability testing and performance testing with pretty much any feature we release. There have been times where we're like, Okay, well, one or two of our biggest customers were hitting about nine seconds for page load. Everyone else is under that. That's with just loading it in real-time. The determination was like, That's sufficient. That meets the goals that we're looking for for this page based on how often it's used, how urgent the data is, all those kinds of things. We've done it a bunch of different ways. I think that's something, too, that it can be easy to fall into a trap of, Oh, we cached it in the database here, so we're just going to do that every time we have a slow page, or we added an index here and it really sped things up, so we're just going to always add indexes.
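The overnight cache versus real-time count trade-off can be sketched in a few lines of plain Ruby. Everything here is hypothetical: BreedCounts, CachedCount, and the in-memory hash stand in for whatever job, table, and models a real application would use, and none of it is taken from Red Canary's code.

```ruby
# Hypothetical sketch; all names are made up for illustration.
CachedCount = Struct.new(:value, :refreshed_at)

class BreedCounts
  def initialize(cats)
    @cats = cats    # stand-in for the cats table
    @cache = {}     # stand-in for a database-backed cache table
  end

  # What a scheduled overnight job might do: recompute and store each breed count.
  def refresh!(now: Time.now)
    @cats.tally.each do |breed, n|
      @cache[breed] = CachedCount.new(n, now)
    end
  end

  # Cheap read for the dashboard; may be stale, so the UI should say as of when.
  def cached(breed)
    @cache[breed]
  end

  # Expensive read that is always current.
  def live(breed)
    @cats.count(breed)
  end
end

cats = %w[maine_coon siamese maine_coon]
counts = BreedCounts.new(cats)
counts.refresh!                      # the nightly run
cats << "maine_coon"                 # a new cat arrives during the day
stale = counts.cached("maine_coon")  # value is still 2, as of refreshed_at
fresh = counts.live("maine_coon")    # 3, at the cost of a fresh count
puts "cached: #{stale.value}, live: #{fresh}"
```

Keeping refreshed_at next to the value is what makes the "not necessarily reflected in real-time" note on the dashboard possible.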

[00:29:11.480] - Hilary Stohs-Krause

I think a lot of performance is just picking which trade-off you want to live with. I think the biggest takeaway I could have for anybody is it really helps to know what those trade-offs are before you make the decision, because I think we do sometimes default to what generally works, but then you can run into issues when you have a scenario where it's not the right thing to do.
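The two caching approaches discussed above can be sketched in plain Ruby. This is a minimal illustration and not Red Canary's implementation: `DashboardFigures`, `LazyCache`, and `PrecomputedStore` are hypothetical names, and in-memory hashes stand in for a cache store or a database table.

```ruby
# Hypothetical stand-in for a slow query; counts how often it runs.
class DashboardFigures
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def compute(customer_id)
    @calls += 1
    { customer_id: customer_id, open_alerts: customer_id * 10 }
  end
end

# Approach 1: compute on the first page load, then reuse the result
# until it is older than the TTL (for example 12 hours).
class LazyCache
  def initialize(source, ttl_seconds:)
    @source = source
    @ttl = ttl_seconds
    @entries = {}
  end

  def fetch(customer_id, now: Time.now)
    entry = @entries[customer_id]
    if entry.nil? || now - entry[:stored_at] > @ttl
      entry = { value: @source.compute(customer_id), stored_at: now }
      @entries[customer_id] = entry
    end
    entry[:value]
  end
end

# Approach 2: a scheduled job precomputes the figures (for example
# overnight); page loads only read what is already stored.
class PrecomputedStore
  def initialize(source)
    @source = source
    @rows = {}
  end

  def refresh_all(customer_ids)
    customer_ids.each { |id| @rows[id] = @source.compute(id) }
  end

  def read(customer_id)
    @rows[customer_id]
  end
end
```

The trade-off is visible in the shape of the code: the lazy cache makes the first visitor after expiry pay the full cost, while the precomputed store does the work for every customer whether or not they ever open the page.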

[00:29:33.600] - Robby Russell

That's interesting. There's been a couple of projects that we've worked on, working in the consulting space, and I get exposed to a lot of different strategies. I remember, I don't know why this light bulb went off in my head one day when I saw that there was a team that was in that type of scenario where they're like, Okay, we're going to start caching some of this data ahead of time. So maybe we're doing it hourly or once every... Maybe we don't need to do it over the weekend because no one's logging into the dashboard on Saturday. So maybe we don't need to be running this thing every hour for all these different clients. It's like a multi-tenant database or something like that. Then you're going through and you're running it for every one of your clients, and you're fetching all this data just in case they log into their dashboard, and they may or may not ever do that very frequently. Then there's a lot of other processing that's happening that's not being used. One of the things that I saw a company do, which I thought was interesting, was they did a hybrid approach where they were able to identify clients of a certain size, and they would pre-optimize the cache for those clients.

[00:30:28.620] - Robby Russell

Then let everybody else just hit Rails until it became a problem for those clients. Then they would get automatically bumped up to the next tier. We've got some pre-caching happening. Then they would just change the text a little bit. If it's a certain size client, then we're going to say the visual... You weren't having to make performance decisions for everybody every single time. It was more like it's contextual. We can make this work. It's up to an hour for these really large clients, and they're fine with that. We don't need to keep running this for every single client and getting all this data that may or may not be used because we needed to take care of that really big client. I don't know how well that may or may not work there. I thought that was an interesting strategy. How do you codify that to think about how you treat different-size clients or when you're building these reports?
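A minimal sketch of that hybrid approach, under stated assumptions: the `Client` struct, the `PREWARM_THRESHOLD` value, and the label wording are all hypothetical, and a real threshold would come from measured page load times rather than a fixed row count.

```ruby
# Hypothetical threshold; a real value would come from measurement.
PREWARM_THRESHOLD = 1_000_000

Client = Struct.new(:name, :row_count)

# Large clients get their dashboard cache refreshed ahead of time;
# everyone else is computed on demand at page load.
def cache_strategy(client)
  client.row_count >= PREWARM_THRESHOLD ? :prewarmed : :on_demand
end

# Clients the scheduled job should refresh on this run.
def clients_to_prewarm(clients)
  clients.select { |c| cache_strategy(c) == :prewarmed }
end

# Label shown next to the data so its freshness is explicit.
def freshness_label(client, minutes_since_refresh)
  if cache_strategy(client) == :prewarmed
    "Updated #{minutes_since_refresh} minutes ago"
  else
    "Real-time data"
  end
end
```

Because the tier is derived from the client's current size, a client that grows past the threshold moves to the pre-warmed tier without anyone editing configuration.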

[00:31:13.100] - Hilary Stohs-Krause

Yeah. My first thought with that is I can definitely see the benefits. I'm thinking about our support folks. They now have to track who has what so that they can do the right screen sharing if they're helping debug or send the right screenshots or documentation. You're doubling all of those pieces, which isn't good or bad, just something to think about. I think also then if someone does hit that threshold, which we do have customers who will onboard a new integration, and suddenly that might kick them up, and then they're like, Wait, why did this change? Why does this look different? That's an interesting approach. I think that takes more management, obviously, but I could definitely see there being benefits to that.

[00:31:55.120] - Robby Russell

That particular scenario, they were trying to make that as, not transparent, but it wasn't obvious to anyone, because the only difference in the data would be, This was updated 45 minutes ago, versus, This is real-time data. I think it was that scenario. Another thing that I've seen some companies do is oftentimes someone signs into an admin panel or a dashboard tool, then you log in and you get directed to an index page for some dashboard page. That may or may not be what they're trying to do right now, which is to look at a bunch of data. But because it's the landing page of where you sign in, that just ends up being the biggest hit to the database, even though they were just trying to go look up a user. It's like, why are we fetching all this information so quickly when they're actually not even... Their task at hand is not to go look at the reports. They're going to look up a customer in their database or something.

[00:32:44.680] - Hilary Stohs-Krause

Oh, yeah. This is something we've talked a lot about recently with some of the features that I've been working on. I think we've moved into our data-driven era where it's like, Okay, we know that this feature or page or area of the product isn't where we want it. What are we going to do instead? One of the first steps that we've taken for a couple of things is just adding logging to determine what people are doing right now. Because it's so easy to fall into those assumptions, whether you're engineering or product or sales, whoever. You can ask people, How do you use it? But I think the best data is, What are they actually doing? Because you can't always trust someone to accurately describe how they're interacting with software. But logging doesn't lie. We found that while this was a very important part of the site, it dealt with, let's say, permission setting. So very important, but not something you're going to change very often. This came up as part of the conversation around, Well, what's an acceptable page load time for this page? The data showed us that most of our customers were checking it out maybe once a month.

[00:34:02.500] - Hilary Stohs-Krause

This was a page that they didn't go to view. You don't really check up on your settings. Typically, you go if there's a specific one you want to change, or if there's a new setting you need to configure. Given that, we were like, Okay, well, let's make it easy for them to do that. We're not going to paginate. We're going to let them use multi-select to grab multiple permissions at a time, change the setting. We're going to alert them when there's a new one. But we don't really need to optimize this page for regular usage because that's not its purpose and that's not how people are using it. That really freed us up to make some decisions that were better for engineering without sacrificing the user experience. We had another page where we're in the middle of revamping another dashboard page, and it was like the scenario you're describing. We used to just show the most recent 30 books that someone entered into their library. But most often people were going there to try to find a specific book that they wanted to check out. It wasn't a page people were going to be browsing on.

[00:35:09.860] - Hilary Stohs-Krause

They really were like, I'm here for a specific book that I need to look up some data on. We just took that away. When you go to that main page, you don't see the list of the 30 most recent books anymore. It just starts with the search and filter form, and the first step makes you think about why you're there and put that into the search, because that's what people were doing anyway. They didn't want to sit there and wait for the first 30 to load with all of their join tables and adjacent data because most likely they didn't care about the most recent 30. So that, again, was one that made the experience much smoother and more aligned with what customers were doing anyway and made the page load faster and took some effort out of the engineering work for it. It was just this big win all around because we really stopped to think like, Okay, to your point, what does someone want to do when they get here?
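The search-first pattern described here can be sketched as a query function that does no work until the visitor supplies a filter. This is an illustration only: `Book`, `search_books`, and the in-memory array are hypothetical stand-ins for a model and a database query.

```ruby
Book = Struct.new(:id, :title, :author)

# Returns nil when no filters are given, so the page can render only
# the search form and skip the expensive "most recent 30" query.
def search_books(books, filters)
  active = filters.reject { |_, value| value.nil? || value.to_s.strip.empty? }
  return nil if active.empty?

  books.select do |book|
    active.all? do |field, value|
      book[field].to_s.downcase.include?(value.to_s.downcase)
    end
  end
end
```

A view can then treat `nil` as "show the form only" and an empty array as "no results matched", which keeps the two cases distinct.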

[00:36:03.960] - Robby Russell

And by default, we've just been generating these index pages. Here's the page, and here's the most recent. It's a sort by ID descending or whatever, creation date or whatever. And then which ones are most recently added or modified, what breeds were just recently added to our cat database. So anyone listening out there... You also mentioned talking about logging and being like, if you had logging, you'd be like, Well, a lot of people are going to this page and looking at the list, but then you're like, Oh, what's the next step? Then they went to go filter for something. And then that page load for maybe displaying a filtering thing is probably pretty quick to render because you're not loading up much data or any at all. So for what it's worth, for anyone sitting out there. And then you can always link over to see the most recent ones or something.

[00:36:48.080] - Hilary Stohs-Krause

Exactly. And I think our customers tend to be... They're in technical roles at their companies. It's annoying when you're waiting for a page to load and it's not even going to show you what you want. But if you go through and fill out a filter form like, Okay, I want it with this ID or this title, this author, whatever, and then you hit submit, and it takes a few seconds to load the search results, we understand that pattern. We expect that pattern. That is not going to surprise or annoy us because it was a deliberate action we took that we knew would maybe take some time.

[00:37:24.140] - Robby Russell

This episode of On Rails is brought to you by form_with, the form helper that finally ended the debate. For years, we argued: was it form_for or form_tag? Which one worked with models? Which one handled plain params? And why didn't either feel quite right? That's why I choose form_with. It just works. Model or not, remote or not, HTML or JavaScript, your call. No more remembering two methods, no more passing weird blocks or URL options just to get it to behave. Just forms. One method, done. form_with, because life is too short to explain form_for again.

[00:37:57.430] - Robby Russell

Can we go back to our earlier conversation a little bit around junior and senior-level developers and trying to find those opportunities so you're not waiting for those decisions to be made at, say, a system-wide level? And it even sounds like, how did that conversation about even going from the index page to a search thing, was that a system-wide decision or is that something you were able to... Do you recall how that came about?

[00:38:19.840] - Hilary Stohs-Krause

Yeah, it's a great question. It came from product. There are folks in product who used to do technical work at the company, which is so great because then they have the built-in historical knowledge of like, Okay, this is a slow area of the product, or this is an area of the product with tech debt, that thing. It came from product, and I think a big part of it was because this particular book index, if you will, was written a long time ago and known for being an issue with some of the bigger customers. That was one where we went in knowing that there were performance issues we needed to address. I think it really opened up the creativity to think about, Okay, well, what do we need on this page and what don't we need? We have this idea in our head of what an index is, and we just default to that. Most of the time, that's fine. But there are times where it's just unnecessary and actually creating a worse user experience, which was the case here.

[00:39:17.760] - Robby Russell

How do you think about... When you have those conversations with product, not everything is a technical decision, right? Do you feel like as a team, you're now spotting things, or are you leaning on the product team to come with these ideas and ask the question?

[00:39:33.920] - Hilary Stohs-Krause

It's definitely both. I was on a podcast last year and I was talking about monitoring and logging, and the host made some comment like, Well, everybody has that. I was like, No, no, no, no, no. I have worked places that did not have that. Because we're in cybersecurity, our product has to be working 24/7. To facilitate that, we have a lot of different monitoring and alerting and paging set up. The goal with a lot of that is to try and catch some of these issues, especially for new work. Some of the features were written 10 years ago when we had much smaller customers and we hadn't scaled yet. It was very much like, get stuff done as quickly as possible because it's a startup. With some of that, we're aware that there are issues that we are actively addressing. But for new stuff going forward, A, I think we're doing a lot more performance testing before the feature is released to catch these issues before the customer has to deal with it. Then I think there's definitely more proactive monitoring of the monitoring. Looking for when is there a page where the page time is starting to creep up a little bit.

[00:40:35.020] - Hilary Stohs-Krause

Maybe we should take a look. Did we make a change that inadvertently introduced N+1 queries? Let's take a look. We are experimenting with a tool right now that would give us a lot more insight into our Postgres databases and where we might have problems there. I think that's really going to level us up even more in catching potential issues before they become a full-blown problem, and being able to address that.
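For listeners less familiar with the term, an N+1 query pattern is one query for a list plus one more query per row. The sketch below simulates it in plain Ruby with a hypothetical `FakeDb` that counts queries; it is not Active Record code. In a Rails app the batched form is what `includes` or `preload` on a relation gives you.

```ruby
# Fake data store that counts "queries" so the difference is visible.
class FakeDb
  attr_reader :query_count

  def initialize
    @query_count = 0
    @owners = { 1 => "Ada", 2 => "Grace" }
    @cats = [
      { id: 1, owner_id: 1 },
      { id: 2, owner_id: 2 },
      { id: 3, owner_id: 1 }
    ]
  end

  def all_cats
    @query_count += 1
    @cats
  end

  def owner(id)
    @query_count += 1
    @owners[id]
  end

  def owners(ids)
    @query_count += 1
    @owners.slice(*ids)
  end
end

# N+1: one query for the list, then one more per row.
def owner_names_n_plus_one(db)
  db.all_cats.map { |cat| db.owner(cat[:owner_id]) }
end

# Batched: one query for the list, one for all the owners.
def owner_names_batched(db)
  cats = db.all_cats
  owners = db.owners(cats.map { |cat| cat[:owner_id] }.uniq)
  cats.map { |cat| owners[cat[:owner_id]] }
end
```

With three cats the first version issues four queries and the second issues two, and the gap grows linearly with the number of rows.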

[00:41:04.220] - Robby Russell

I want to zoom out a little bit, actually, on that. Earlier, you mentioned that you started baking in some more standards, like documenting what those might be. Is that something like RuboCop that your team is using or something like that?

[00:41:15.880] - Hilary Stohs-Krause

Yeah, we have a multi-prong approach for conventions and standards and things like that. We definitely use linters. We love linters. We have RuboCop, and we have our own version of that where we write custom cops for different things. We use haml-lint, we have ESLint, we have HadoLint for Docker. I think we have one more that I'm not thinking of. Love linters. Fantastic. I was definitely ambivalent in the sense of love-hate with linters before coming to Red Canary. I'm a true believer now. I have been converted. I've seen the light. I think what I love about linters is I acknowledge that it can be frustrating if there's a new rule that you're not primed to notice or pay attention to. Then you run the linter and it yells at you for a bunch of things, and you have to go and change it. Yeah, that's obnoxious. But you will eventually figure out what that rule is, and you'll get better at just doing it. I love linters because they not only standardize a lot of things, but the rules are written down. This is especially useful, again, if we're talking about more junior folks or someone new to the team. They don't have to guess at what the standards are because we have them documented through the cops.

[00:42:32.540] - Hilary Stohs-Krause

It also takes away some of the opportunity for code review to veer off course. I think without linters, it can be easy for people in code review to inadvertently try to enforce their personal preferences. It doesn't always feel good and can lead to arguments. But when you have a linter, it's very objective, and it's not that you did something wrong or that a person thinks you did something wrong. It's just like, Hey, this is actually how we do it. It's just very impersonal, which I think is great. So linters are fantastic. We also make use of GitHub Actions. We have some scenarios where it's either too complicated for a RuboCop rule or it's more nuanced than can accurately be captured in a RuboCop rule. But we still want to make sure that the person is aware that this is maybe an anti-pattern or that this could have bigger consequences than maybe they realize. We have some scenarios where there are certain files that if you touch that file in any way, we have a GitHub action that will automatically post a comment on your PR that says, Hey, you touched so-and-so file.

[00:43:46.300] - Hilary Stohs-Krause

Here are reasons why we typically don't do that, or if you did do that, you also need to alert this team that this is changed, or whatever the case might be. That's also a helpful way to enforce some of those standards or conventions. Then lastly, we have a massive document that has our coding standards and conventions written out. That's helpful, especially for things that are maybe higher level. There was a period of time where whatever programming language somebody thought was best for a certain feature they could just use. We ended up having, I think, five or six different languages in our main repository. Yeah, good times, especially when you land on one of those files. There's been a concerted effort to standardize everything. We got rid of CoffeeScript recently. We had a couple of files hanging out that were CoffeeScript for some reason. We are slowly converting our React files to go all in on Turbo and Stimulus and Hotwire. There was one other one that we... Did we maybe have some random Go hanging out somewhere? I don't remember. But that's the kind of thing where we could write a cop for if someone tried to add a new CoffeeScript file.

[00:44:53.200] - Hilary Stohs-Krause

But it's more a scenario of just letting people know not only here's what we're doing, but here's why. And so that one has everything from what specs do we prefer, what languages do we use. We have documentation about the best way to write a new migration, all of this stuff.
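The "you touched this file" check can be thought of as a small mapping from path patterns to messages. The sketch below covers only that decision logic in plain Ruby. The rule table, file paths, and messages are hypothetical, and the step that fetches a PR's changed files and posts the comment through GitHub is left out.

```ruby
# Hypothetical rules: glob pattern => message to post on the PR.
SENSITIVE_FILE_RULES = {
  "db/structure.sql" => "Schema changes need a heads-up to the data team.",
  "config/initializers/*.rb" => "Initializer changes affect every process at boot."
}.freeze

# Given the paths changed in a PR, return the comments to post.
def comments_for(changed_paths)
  SENSITIVE_FILE_RULES.each_with_object([]) do |(pattern, message), comments|
    hits = changed_paths.select do |path|
      # FNM_PATHNAME stops "*" from matching across directory separators.
      File.fnmatch?(pattern, path, File::FNM_PATHNAME)
    end
    next if hits.empty?

    comments << "You touched #{hits.join(', ')}. #{message}"
  end
end
```

Keeping the reason next to the pattern means the comment explains the why as well as the what, which matches the intent described above.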

[00:45:11.220] - Robby Russell

For a little bit of context for our listeners, how large of an engineering team is at Red Canary? People listening, just for that context, I'm like, Well, we all know. Well, if your team is 5-10 people, it might be very different than how large is Red Canary.

[00:45:24.140] - Hilary Stohs-Krause

Yeah, for sure. We are, I want to say around 400 people. It looks like in our main engineering channel, which is cross-team, we have about 100 people. I do think my old company was much smaller. I think there were maybe 12 of us at our peak. Even then, having the documentation is super helpful, for a number of reasons. One, you're always going to be bringing on new people. Two, it might be documentation around conventions for a part of the code base that you don't touch that often. Then maybe nobody touches that often. Or maybe that's always Jill's part of the code base, and then Jill goes on vacation and someone has to touch it and it's like, Well, how does Jill normally do this? I still think even with a smaller team, it can be really valuable. Some of it is also... A lot of times we'll do both. There's one initiative that I'm taking on, which is our class structures for our models are all over the place. Some of them have the scopes under the delegate, some of them have scopes scattered throughout, some of them have all the validations together, and all the validations are under the constants. It's very hard to find something if you're going to a new class because there isn't really a structure to it.

[00:46:37.040] - Hilary Stohs-Krause

We came up with a structure we all liked. We're going to add a RuboCop cop to enforce it, and then we're going to go through the to-do list and convert everything. This is all a lot of work. It is a lot of work. I totally get if people are hearing this and being like, Well, there's three of us. When are we going to have time to do any of this? I think it's one of those classic spend three hours now to save 20 hours over the course of the next year. But that one is on my plate, and I have been continually being like, Oh, there's other stuff to do. I'm not perfect at it either.
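To make the idea of an enforced class structure concrete, here is a rough sketch of an ordering check in plain Ruby. It is not a RuboCop cop and not Red Canary's agreed structure: the `SECTION_ORDER` list is hypothetical, and the line-based classifier is a simplification of what a real cop would do by inspecting the syntax tree.

```ruby
# Hypothetical agreed order of sections in a model class.
SECTION_ORDER = %i[constant association validation scope delegate].freeze

# Very rough line classifier; a real cop would inspect the AST instead.
def section_of(line)
  case line.strip
  when /\A[A-Z][A-Z0-9_]*\s*=/ then :constant
  when /\A(belongs_to|has_many|has_one)\b/ then :association
  when /\A(validates|validate)\b/ then :validation
  when /\Ascope\b/ then :scope
  when /\Adelegate\b/ then :delegate
  end
end

# Returns [line_number, section] pairs for lines that appear after a
# section that is supposed to come later in the class.
def out_of_order(source)
  highest = -1
  source.each_line.with_index(1).each_with_object([]) do |(line, number), offenses|
    section = section_of(line)
    next if section.nil?

    rank = SECTION_ORDER.index(section)
    if rank < highest
      offenses << [number, section]
    else
      highest = rank
    end
  end
end
```

Running a check like this across existing models produces the to-do list of classes to convert.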

[00:47:09.680] - Robby Russell

How has your team encouraged a culture where, say, small improvements like that are part of the flow and not just something you do when you have a little bit of extra time?

[00:47:19.840] - Hilary Stohs-Krause

I was actually just talking with my manager about this because I felt like there was all this documentation that I wanted to write, and I just wasn't getting it done. One of his suggestions was block off time on your calendar so no one bugs you. We also have built-in focus time for engineering, and so that could be another good time to do it. But really what I found to be helpful for me personally was he said, Also, especially if it's related to ongoing sprint work, just make a one-point ticket for it. Then it's documented, pun intended. It's part of the sprint. It maybe feels more like official work because it is. I found that it's been really helpful to make me feel like I have more permission to work on this, even though it's very widely supported at the company. I think also one of the things that helps with our culture around addressing tech debt and writing documentation and things like that is that all of the leadership in engineering, all the way up to VP and CEO, are all people who have been engineers, which I know is not the case everywhere. You just have this built-in innate understanding of like, Oh, yeah, this has value. This is important. This could kick us in the butt down the line. Of course, you always want more time to work on those things, right?

[00:48:37.950] - Robby Russell

Of course. You mentioned earlier also around when you're looking back at, say, performance things and you ship something, you're like, Okay, maybe I'd introduce some... So you have this logging and metrics that you're keeping track of and monitoring things. What's the process like? Is it you or are there people on your team that are responsible for going and looking at those metrics? Because a lot of companies will collect a lot of data, but someone needs to go look at it and understand what's going on. Unless you've come up with some clever way to help it tell you, Hey, we need to address this issue because something got deployed and something changed by 15%, and it automatically assigned a task to someone to look into it. Are you at that point?

[00:49:16.340] - Hilary Stohs-Krause

In the middle, I think. We have a couple of Slack channels that our error reporting software will directly post to when there's a new error that's logged. It's encouraged now after you merge. We do continuous deployment. After you merge something to production, just hang out in that channel for a little bit and see if any new errors are sparked. That service will try to guess the suspect commit. It's right sometimes. But you can usually tell if it was something you just pushed that's causing the issue. The way that we handle on call is week-long shifts. And the general expectation is that any sprint work you get done is a bonus. So that's really meant to be a time to spend on things related to site reliability, whether that's adding new documentation, whether that's just going and digging into charts and looking and seeing has anything changed? Is there anything that is looming that might become an issue down the line? What's our database health look like over the last four months? Are there any trends that we should be paying attention to? I think that's probably the closest thing we have to built-in time to do that.

[00:50:27.200] - Hilary Stohs-Krause

We do also have some folks on the team who just really enjoy that. First thing they do, they log on, they're like, I'm going to check out what new error exceptions got posted to Slack and see if I can tackle any of them. The monitoring, too, we do have some monitoring that will alert us if there are trends that are climbing too sharply. I say that, I'm pretty sure we do. I don't know that there's a perfect way to do it. I would love if we could get to a state where it would just automatically assign to a team like, Hey, your cat index page was 5% slower over the last two weeks or something. But I think we're not quite there yet. Compared to other places I've worked, we have a lot of insight into... If you have a question, it's generally pretty easy to get the answer about performance.
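The wished-for "your page got 5% slower" alert boils down to comparing two windows of response times. A minimal sketch, assuming the timings are already collected as arrays of milliseconds; the function names are hypothetical, the 5% threshold is simply the figure from the example above, and a production check would likely use percentiles rather than means.

```ruby
# Fractional change between the mean of an earlier window and the
# mean of a later window of response times (milliseconds).
def relative_change(baseline_ms, recent_ms)
  base = baseline_ms.sum.to_f / baseline_ms.size
  recent = recent_ms.sum.to_f / recent_ms.size
  (recent - base) / base
end

# Returns an alert message when the slowdown meets the threshold,
# or nil when the page is within tolerance.
def slowdown_alert(page, baseline_ms, recent_ms, threshold: 0.05)
  change = relative_change(baseline_ms, recent_ms)
  return nil if change < threshold

  "#{page} is #{(change * 100).round(1)}% slower than baseline"
end
```

The returned message could be routed to whichever team owns the page, which is the automatic assignment step described as not quite there yet.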

[00:51:15.060] - Robby Russell

When we were preparing for this conversation, you had mentioned that your team has had to deal with things like CI bottlenecks or flaky tests. Do you recall that conversation we were talking about?

[00:51:26.130] - Hilary Stohs-Krause

Yes.

[00:51:27.640] - Robby Russell

What were the symptoms of your flaky test suite at the time? What have you done to address that?

[00:51:33.480] - Hilary Stohs-Krause

That's one of those ones that I feel like some of these things are so tricky because typically, they're not jumping from a Monday to a Tuesday. It's not like, Oh, everything's fine. Our CI is great and super fast, and then suddenly it's terrible. It just happens gradually. You get used to it and you don't notice it until you realize that you've been kicked out of the merge queue four times in a row for tests that are not related to your code. Again, we looked at the data. We had someone who pulled a bunch of data from our CI service, and they were able to calculate the amount of developer time lost to getting kicked out of the queue, which is also really valuable data for getting buy-in from the people who allocate your time. It was not great. It certainly wasn't where we wanted to be. We started an initiative to tackle flaky tests. There were a couple of ways that we addressed it. One, again, we created an epic. So anytime a flaky test was identified, people could throw it into a Slack channel, and it would get converted into a ticket in this epic. People were assigned to it who were like, If you were on call, you could tackle some, but also if you were a floater in between epics, that was a great time to work on the flaky tests.

[00:52:52.520] - Hilary Stohs-Krause

I think it was one of those things where it really built momentum because you start to recognize some of those patterns as an organization. Like, Oh, we have a bad habit of test data leaking between specs, or, We are not accounting for time zones enough, or, We're matching on order when really we just care about content. Once we started to pick up on some of those patterns, then it started to get faster to be like, Oh, yeah, I've seen this before. I know how to fix this. I can go ahead and tackle it. I believe we've cut our time lost to flaky specs in half by now, and there's still a little ways to go, but we got to an acceptable level, and now we're working to get to an ideal level, I think. But I think it was also just a really good exercise for everyone. This is really an example of the impact of not site performance, but developer or development performance, and how that can slow everything down. It wasn't really any one person spending hours and hours and hours on this. It was just everyone picking up the ticket here and there, knocking it out, making everything faster.

[00:54:03.560] - Robby Russell

I'm curious with the flaky test, you mentioned things like time zones. Tell us a little bit more about the types of other things that you recall that our listeners might be like, Oh, maybe that actually resonates with something that we're dealing with.

[00:54:15.580] - Hilary Stohs-Krause

Yeah, one of the ones I fixed was definitely a case of we had an expect statement that a certain method would return an array with three items in it. We use RSpec, so we were using to eq an array with these three items, which is order dependent. That's one I saw a few times. We changed that to match_array instead to just match on the content, and that test was solved. I think probably surprising to no one, some of our flakiest tests were our front-end system specs. Something that we've been using to fix those that I actually just used for the first time last week, and I think I'm in love, is a gem called SitePrism. I don't know all the technical details of what it does, but my understanding is that it runs the browser piece on a separate thread than the spec. It comes with all these built-in features. You basically set up the elements of a page that you want to interact with in your specs, like the cat species dropdown, let's say. You can set up an element like that. Then in your spec, you get some of these built-in helper methods, like wait until cat species dropdown is visible. Whereas before we might have had to do sleep 2, or constantly recheck until it evaluates to true. It cleans it up a lot, and the error messages are better. I really like it. I'm glad we're using it. But I think that we're starting to rewrite. At least for my part, I'm writing any new system test using this, and then I think we'll start to back-convert some of our older specs to use it. But we used that for some of the flakiest ones, and it deflaked them.
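The order-dependence problem is easy to reproduce outside a test suite. The sketch below uses plain Ruby rather than RSpec so it runs on its own; `same_in_order?` and `same_contents?` are hypothetical helper names. In an RSpec suite the equivalent change is from `expect(result).to eq(expected)` to `expect(result).to match_array(expected)`.

```ruby
# Order-sensitive comparison: the behaviour of `eq` on arrays.
def same_in_order?(actual, expected)
  actual == expected
end

# Order-insensitive comparison: same elements with the same counts,
# in any order. This mirrors the intent of RSpec's `match_array`.
def same_contents?(actual, expected)
  actual.tally == expected.tally
end

# A query without an explicit ORDER BY may return rows in any order,
# which is one way an `eq` assertion ends up flaky.
expected = %w[tabby siamese sphynx]
returned = %w[siamese sphynx tabby]

same_in_order?(returned, expected)  # false, although nothing is wrong
same_contents?(returned, expected)  # true
```

If the order actually matters to callers, the better fix is to sort in the code under test and keep the strict assertion.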

[00:56:05.460] - Robby Russell

You said that was SitePrism, right? I see the gem here. I just pulled up the GitHub README. I'm looking through this. I can see that you can do things like, wait until menu is visible. Wait five seconds? I guess I'm assuming if it takes more than five seconds, it's going to throw an error, and that's maybe an indicator that there's some performance issues, maybe. Or is it retrying every five seconds? I know you just started using this last week.

[00:56:27.440] - Hilary Stohs-Krause

I am not sure. That's a great question. But yeah, and this was a gem that I believe one of our newer engineers had used at a former workplace and took on the flaky specs as one of the things that he was going to champion. He was like, We should really try this. We should really try this. We should really try this. Then it was like, Okay, yeah, go ahead, install the gem, do some proof of concept, see if it works. It did. Speaking of development performance, I'm really big on code that has readability and clarity and is easy to digest, even if you're unfamiliar with it. I found the SitePrism setup to be just pretty intuitive to read. I think that's also a big win.
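The general technique behind "wait until visible" helpers, as opposed to a fixed `sleep 2`, is to poll a condition until it holds or a deadline passes. The sketch below is a generic version in plain Ruby. It is not SitePrism's implementation, and `wait_until` and `WaitTimeout` are hypothetical names; check the gem's README for how its own timeouts behave.

```ruby
class WaitTimeout < StandardError; end

# Re-checks the block until it returns a truthy value or the timeout
# passes, then raises. A monotonic clock is used so that wall-clock
# adjustments cannot shorten or extend the wait.
def wait_until(timeout: 5, interval: 0.05)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise WaitTimeout, "condition not met within #{timeout}s"
    end

    sleep interval
  end
end
```

Compared with a fixed sleep, polling returns as soon as the condition holds, so the fast case stays fast, and a slow page shows up as an explicit timeout error instead of a confusing assertion failure.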

[00:57:14.540] - Robby Russell

Awesome. I'll definitely include links to that for everybody in the show notes so people can take a look at that a little bit more in-depth. I'll have to poke around that myself. It's maybe a good time to pivot the conversation to the front end. As I know that your team, as you mentioned earlier, you're migrating from React to Hotwire and Stimulus and such or Turbo. At least, can you talk a little bit about what drove that decision to change your front-end tooling?

[00:57:42.680] - Hilary Stohs-Krause

Yeah. I mean, as far as I know, I wasn't part of the initial conversations. I know this is something they've been wanting to do for a while, but my understanding is it was really just wanting to go all in on the Rails ecosystem. It was like, Okay, we've got built-in tools that work with Rails. Before, I think they went to React because they didn't feel there was really a great alternative that could give the same benefits. But now we have one. We tend to hire folks who have Ruby and Rails experience, not exclusively. We definitely brought other people on and taught them Ruby and Rails. But a lot of the folks who come here do have that. I've barely used React. Other people I've talked to have barely used React. It was getting to a point where we would have to make changes to React code, and it took longer than any of us wanted it to because we didn't have the familiarity. Interesting. I think that was probably part of it. But also, yeah, just really going all in on Ruby and Rails and being like, Hey, if we have these things that work natively with what we're already doing, and the team is going to be more familiar with the syntax, and we know how to test it, let's just do that.

[00:58:49.160] - Robby Russell

It wasn't necessarily, to your knowledge, driven by any external pressure or performance-related issues for the end user. It was maybe optimizing for your developers' current skill sets and most likely future skill sets and betting on the idea that this isn't going to be a first-class citizen within the Rails ecosystem. Let's not have this extra thing that we don't have that many people around that feel as, say, competent or are going to be as productive. That's what it sounds like.

[00:59:14.180] - Hilary Stohs-Krause

I think those are big parts, but I do think performance was, now that you say that, probably a factor, because I'm thinking about the parts of our product that still rely heavily on React, and they are areas where we sometimes run into bottlenecks and loading issues and things. I don't know if that was the primary reason, but I do think it was a contributing factor.

[00:59:35.320] - Robby Russell

If someone's listening and considering a similar tech stack shift like that, what should they know going into it?

[00:59:41.800] - Hilary Stohs-Krause

Converting code, especially if it's a feature that is heavily dependent on React, is not necessarily quick. Something we found that really helped: we have put a lot of effort into creating a view component library. That's because our two designers that work with my team are also engineers, which is fantastic. Highly recommend if you can get it. They built a lot of view components. That is one way that I think in our experience really has sped up the conversion process. We just took one of those index pages I was telling you about that had a big table on it. That was all in React. When we were going to go in there and redo the whole thing, we were like, Okay, well, we're going to also convert it out of React and put it into one of the UI components. I didn't know how long that was going to take. I was the tech lead for this project. I asked one of our engineers on the project who had the most experience with React, and I said, Hey, can you do this conversion? Because I feel like you're going to have a better... It's just going to be faster. It's going to be better if you do this because you know both. It took him way less time than I thought it would and way less time than he thought it would. A big part of that was because he didn't have to write everything from scratch. He could just create a new instance of this view component and make sure that we turned on some of the same features that were being used by the React code. It really was less of a conversion and more of a modeling after, and then we could just delete all the React code. That has definitely made it a lot faster because then we're not writing all of the new JavaScript each time we're doing it. It's just baked into the component and we're just calling what's already there with minor alterations.

[01:01:31.580] - Robby Russell

Is it a safe assumption, based on the way you're describing that, that you have been using something like React on Rails? Is React within the Rails app, since you mentioned it's a monorepo, or is the React code actually all in a different repository?

[01:01:45.360] - Hilary Stohs-Krause

It's in the repository, yeah.

[01:01:47.500] - Robby Russell

Are you then able to leverage much of your existing testing and QA process to make sure things are still... I'm assuming you're trying to keep a fairly high level of feature parity between these different interfaces?

[01:02:00.520] - Hilary Stohs-Krause

To a certain extent. We've been tackling those based on when the feature gets a refresh, which also helps because then we don't have to worry about parity. It's more just, what are the features we want now in this new iteration, make sure those work. I think that also is a way that speeds up the process because we're not trying to do a one-to-one conversion. We're really just replacing it, which definitely helps. For specs, and this is another reason I think I'm excited about SitePrism, is because so often our system specs were flaky, that's definitely an area that we can and are improving upon, the coverage with our system specs. It's also easier to write those using the view components and using Stimulus because, again, it just feels more like it's all part of the same ecosystem. You're not as reliant on just system specs the way we were with React.

[01:02:59.380] - Robby Russell

To clarify, when you mentioned that you're moving things away from React when that area of the app is going to get a refresh, what constitutes a refresh? Is it a pretty big undertaking type of thing? We're rethinking the layout here overall. Or is it like the next time we have to make a change to something? Is it still likely that someone's going in there and making changes to the React code right now, and that's going to be done in parallel until something bigger comes along that warrants going in? I'm just thinking, how do you eventually get rid of React in that scenario if there's a bunch of areas like, we're probably not going to touch that for two years. And so now you're maintaining two different paradigms there.

[01:03:38.340] - Hilary Stohs-Krause

Typically, it's been when we're doing a larger overhaul. So we have new designs or we want to add quite a few. Usually, it's new designs and new features, something that we're going to dedicate an epic to that piece of the product. There are people still writing React code, and it's one of those things where everyone's sad about it for that reason you mentioned, because, okay, great, now that's more code we're going to have to change. But I think it's the most efficient way to do it, because then, like I said earlier, we're not reliant on doing that one-to-one parity. We're actually building what we need in the new code. Also, that's when we have a lot of hands on deck for testing and evaluating and code. It takes longer to do the replacement, but it ensures that when we are doing the replacement, it's going to go as smooth as possible.

[01:04:29.280] - Robby Russell

I like that. Well, I understand that, I should say. When I talk to different teams, when they're trying to figure this out, it always seems that... For those listening, maybe this will connect with you a bit. All right, we want to start experimenting with Turbo, what have you. Insert some new technology, new paradigm that Rails or the ecosystem has given us. On this next new set of features, let's try it there. Quite common. I completely understand: well, this is a new thing. Let's build this in the new thing. We'll see how we like it. Then maybe if this works out, we'll go back and retrofit everything that we have one day. What I've actually seen happen is you might do that, you learn a little bit, you start doing it on some new stuff, and then you're still maintaining things in multiple ways. Then two years from now, you still haven't gotten rid of the old thing, you've got this new thing. There's another new thing, and now you have three different strategies that you're maintaining. One of the things that I've tried to advocate for with some teams that are experimenting: rather than do this on the new feature that you want to get out the door, where the product team has some new software or new area that you want to build out, try retrofitting something small that you know works. You know how to QA it, how to test it, and see if you can replace that right now and see what works because you know what the expectation is. You're not learning what the feature should be in parallel to learning this new technology, where you're going to clunkily get it working enough and then ship it, and then you're like, Well, we learned from that. I don't know what the right answer to this is, but I'm always trying to be like, Let's go clean up the existing stuff a little bit. If that sounds scary to you, my argument is then I feel like that maybe helps highlight some things where you don't feel confident about your team catching the issues that might pop up. It could be lack of test coverage, QA process, what have you. If you're afraid to break something, that might speak to some other underlying thing. Anyway, I'm getting off my soapbox now.

[01:06:18.980] - Hilary Stohs-Krause

No, I totally agree. I totally agree. I have definitely been in scenarios where someone, like the consultant company, there was somebody who was always into the new hotness. So every new product, they're like, We're going to do it with this one. And I'm like, But no one else knows that. And like you said, now, depending on which client we're working on at a given time, we have to know three or four different ways to ultimately do the same thing. So I always joke that I am not an early adopter. I'm like a solid middle. I'm a medium adopter. But I want to wait for everyone else to figure out the bugs and work those out, and then I can make a better decision about what to use. And I can see why developers, especially, might get frustrated if they're like, Oh, but this new framework came out and it's so much better than the one we're using. I would rather be frustrated that we're moving too slowly because we want to really consider our options and make a decision we're going to stick with for a long time, than, Oh, yeah, sure. Go make a PR with that and go make a PR with that. And then you have six things. I find that ultimately way more frustrating.

[01:07:21.500] - Robby Russell

You just mentioned how you end up with a situation where there's a new tool that you have to work on and you try it on your technology. You mentioned there are several apps, different microservices, maybe built with different programming languages that very few people have experience with. Might have made sense. There's also an interesting thing around how do you give your team enough autonomy and also some space to experiment and do a little bit of research on things. If you don't actually have an applicable... I don't really like throwaway test projects personally at my company because I'm always like, Well, you learned something, but we didn't really have to put it to the test of actually using it. So we don't really know if we learned that much from that outside of, that was a fun little side project. But let's try it on something that we're actually going to put into place and see if that use case works. But then you do give people permission to do that, and then you're stuck having to maintain all these things.

[01:08:09.500] - Hilary Stohs-Krause

Yeah, I think, and this might get me some hate mail, but I don't think that, and this is coming again from the perspective of someone who works with people of all experience levels on a monolith code base. I don't know that I think that work is the best place to do a lot of that experimentation. What I mean by that is I have only a few times seen where some major shift, like let's use an entirely different JavaScript framework, has paid off in the end. I think there are a lot of scenarios where we say, Shipped is better than perfect. I think it takes a lot for there to be enough reason to completely change a system. I've more often seen that lead to confusion and inefficiency and frustration than to developer happiness. Obviously, there are exceptions. I think the experimentation that I have seen pay off is less of replacement and more of addition. So like, SitePrism. We're not switching from RSpec to Minitest. We're just saying, Okay, we see where there's a problem with what we're using. Can we bring something else into what we're using to make it better? You're still experimenting and you're still making improvements and things like that. I think I'm for experimentation, but I'm very resistant to overhauling a system without demonstrated value.

[01:09:46.420] - Robby Russell

I can definitely appreciate that. I think I wish more teams thought about this type of thing more often. Because there's that interesting thing where you mentioned being able to have a chance to experiment. I think there's a level of the professional development aspect there. It's an interesting topic and probably a whole other podcast conversation we can get into. It's probably not the podcast for that, but maybe you need to do a bit more experimenting in your own time, maybe, is the key there. If you're curious about learning about this other stuff, and then if you can come back to your team and be like, Hey, I learned how to do something, that's awesome. One thing that comes up often for teams with a lot of history in their code base is pattern drift. We touched on that a little bit in terms of just different tech stacks you might use or maybe introducing something like SitePrism. But as an organization, as a team, how do you approach situations where multiple patterns do exist for solving a very similar problem? And how do you start to get people on a similar path forward? You mentioned a little bit about having custom cops and stuff. Is there more to it than that?

[01:10:43.600] - Hilary Stohs-Krause

I think the first thing really is getting buy-in on what route or pattern you want to follow. And luckily, at least in my experience here at Red Canary, if you just have a pattern that you want to use and can explain why, people are generally on board with that, which is great. I gave an example earlier of how we structure our model classes in all different orders, and I wanted to make one order, and I put in a bunch of time, and I made a diagram, and we have delegates and then we have these, and then we have these, and then we... I was like, this is mostly following RuboCop's example, and then also trying to keep things together when it doesn't say which one. I had one... I'm not sure if you've heard of our... An engineer who's been with the company a long time. I was like, Can you look this over? Does this look good to you? He was like, Yeah, this looks great. He had one suggestion, and then I presented it at an engineering meeting and asked if anyone had comments. Here's where I wrote it up in our documentation. Everyone just gave thumbs up in the chat and no one left a comment, and I was like, Okay, that was pretty easy.
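To make the idea of one agreed-upon order for model declarations concrete, here is a minimal plain-Ruby sketch. It is not the team's actual standard and not a real RuboCop cop: the `DECLARATION_ORDER` list and the `out_of_order` helper are hypothetical names chosen for illustration, and the order itself is an assumption.

```ruby
# Hypothetical agreed-upon order for macro-style declarations in a model.
# The real order a team picks will differ; this list is only an example.
DECLARATION_ORDER = %w[include belongs_to has_many validates scope delegate].freeze

# Returns [line_number, keyword] pairs for declarations that appear after
# a declaration that is supposed to come later in DECLARATION_ORDER.
def out_of_order(source)
  highest_rank_seen = -1
  offenses = []

  source.each_line.with_index(1) do |line, number|
    keyword = line.strip[/\A[a-z_]+/]       # first bare word on the line, if any
    rank = DECLARATION_ORDER.index(keyword)
    next if rank.nil?                       # not a declaration we track

    if rank < highest_rank_seen
      offenses << [number, keyword]
    else
      highest_rank_seen = rank
    end
  end

  offenses
end

sample = <<~RUBY
  class Account < ApplicationRecord
    has_many :users
    belongs_to :plan
    validates :name, presence: true
  end
RUBY

p out_of_order(sample) # prints [[3, "belongs_to"]]
```

A check like this only looks at the first word of each line, so it is a rough illustration of the rule rather than something to enforce with; a custom cop working on the parsed syntax tree would be the more robust route.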

[01:11:50.980] - Hilary Stohs-Krause

Great. I would caution against... Because I did this, I assumed, Oh, people did it in this way for a reason. There was some meaning to them or some purpose for structuring in this way. But I think a lot of times people are either just following what someone else did or they aren't thinking about pattern. It's just like, Oh, well, this scope is maybe similar to when I did, so I'm going to put it here, or I always put new methods at the bottom of the class or I put methods by similar method. I think a lot of engineers have opinions for sure. I think a lot of other engineers just want to know what the pattern is that they should be following and don't necessarily care much which pattern it is as long as there is one. First step would be figuring out why are we using the different patterns we are? Is this deliberate or is this just happenstance? Then what pattern is going to make the most sense for us going forward? And then really, and this is the part that I have found to be the trickiest in my experience with adding new standards, is how do we deal with the backlog? So, like, haml-lint was a linter that I added when I came to Red Canary.

[01:13:06.200] - Hilary Stohs-Krause

And we're still working through one of the to-do items, one of the rule sets from the to-do list. We've taken care of everything else. There's one that's just hanging on because, of course, it's the one that's the hardest to change and the easiest to introduce errors with and things like that. In my experience, the easiest part is figuring out the process of picking a pattern and getting buy-in. That has been the easiest in my experience. The hardest part is the implementation, for sure.

[01:13:35.740] - Robby Russell

It's one thing for everybody to give a thumbs up to an idea and be like, Oh, yeah, and then hope that they'll keep that in mind the next time they encounter a similar scenario where they might need to apply that pattern or standard that you've defined as a team. But then there's also just the slow degradation, unraveling over time or mimicking the other code that exists in your software project. So like you're adding a new method and you're looking at some really big model, some God object in there, and you're like, Well, there just seem to be similar methods here. I'll just add another one here. It's not intentional necessarily, because if I am super intentional here, then I probably need to clean up this whole thing, and that's just too big for the task at hand. I'll just tack it on here. At least it's close to where the other things are. I'm thinking of walking into a room like, Well, I guess this is the clutter area. We're just going to put stuff over here. It's a challenge for teams to get over that hurdle.

[01:14:28.560] - Hilary Stohs-Krause

Yeah, it's like the junk drawer of code, right? I think really the only thing that I've seen work so far is that any cleanup effort like that just needs a champion. Someone who's going to write the tickets, someone who's going to encourage people to pick them up, someone who's going to pick up probably a lot of them themselves, somebody who's going to try and get buy-in from management like, Hey, we've got a week in between sprints or between epics. Can we knock out 10 of these? Some are easier than others because I think there are some where you see... The flaky tests were one that it was not hard to get people to do because there was an immediate impact on their own work. Some of it is just doing a few as an example so people can see, Oh, wow, this is way better. This is cleaner. This is easier. This is faster, whatever the case might be. Then people get more excited and then they want to contribute. But I think you definitely have to be prepared to do a lot of the work yourself.

[01:15:23.260] - Robby Russell

That goes back to the earlier point around being maybe a certain skilled developer where you might have the confidence and... Do you think it actually... I think I probably know what you're going to say, but for our listeners, do you think you need to be a senior or a staff-level developer to be the champion of those things, or is that something that anyone across any level of developer could help champion and start making a difference?

[01:15:45.680] - Hilary Stohs-Krause

I think anyone could. I say that with the caveat that there are certainly cultures where you are expected to stay in your lane. You have to make a judgment call based on your experience with your own company. But I think that is a great way to not only grow your skills and improve your engineering environment, but also, if you're interested in moving forward in your career, getting promoted or getting to take on more challenging projects, that can be a great way to build your reputation and capital and just awareness of yourself at work, especially at a larger company. When I first started, we had 100 engineers or whatever, people in engineering. I'm just one person. There was no reason that anyone would know who I was. But then I became the haml-lint person. Then I got really into the idea of accessibility from the development side of things. Then that was something that I became. Just having anything that people are aware of that you can attach your name to, I find that that can be really helpful in terms of your career or even just getting to meet more people in your organization and build some of those relationships.

[01:17:01.400] - Hilary Stohs-Krause

I think it's work, but I think there are definitely a lot of benefits outside of just making the code base better.

[01:17:08.220] - Robby Russell

I wholeheartedly agree with that. I think that just seems like such an in-retrospect type of thing. I just wish that was advice I would have been able to give people earlier on in my career. Be the champion, be the person that's actually just like, Hey, look, we cleaned up a couple of things. I was able to sneak a few things into the sprint. Tell people you did it, obviously. Hopefully you're not in a culture where people, as you mentioned, say stay in your lane or, Why are you making these changes here? If you're part of a team right now that might ask questions like that, and let's say you've just completely derailed everything else you were supposed to do, and you just spent three weeks refactoring all your models or something like that and didn't tell anyone, I would not advise that. But I think figuring out where you can make those incremental improvements is key. And I think another thing I was curious about, Hilary, is, what are some ways that you've seen teams celebrate those small little victories or keep some momentum going, or at least start building momentum if they don't...

[01:17:59.700] - Robby Russell

Because I think there's probably people listening. They're like, Sure, great. I can do some stuff. But then they're like, Well, I don't know if anyone's going to notice, or if the other members of my team don't seem to be prioritizing it right now, why should I prioritize that? Or if I am motivated, what can I do to get some buy-in or some momentum building within my team?

[01:18:16.560] - Hilary Stohs-Krause

Yeah. I mean, that can be tricky for sure. Sometimes it does feel like you're just doing it in the void. That it's, This is just for me. This makes me happier or feel better. Maybe that has to be good enough for now. I don't think it's always that way, but that is how I felt sometimes. I write a lot of documentation, and a lot of times it is for me. I'm like, Hilary in six months will be very glad that she wrote this down. But the other day, I posted some document that I'd updated in a channel. I was like, Hey, FYI, I added more detail to this document on how to do whatever. One of my coworkers responded and was like, Your documentation has literally saved my life multiple times. Obviously not literally saved his life, but I assume he was referring to being on call and having incidents, and I had written something. He was like, Great, this is exactly what I need. I think a lot of times that does go unnoticed. It was really great to get that validation that people are reading this. This is useful. I'm not just doing this for my health, but I do think a lot of times it is like silent service, if that makes sense.

[01:19:19.440] - Hilary Stohs-Krause

I think ways to mitigate that, when we were getting rid of CoffeeScript in our code base, the engineer who was championing that attempted to gamify it. He was tracking, Okay, who's making the most PRs? I think once a week, he would do an update like, All right, this person, she's had six PRs to get rid of CoffeeScript. She's on top for now. Who's going to rise up and take her spot? He wasn't sure how much of an impact that made, to be honest. Kind of unclear. But that is one approach you could take. I think, honestly, this is also just where a lot of people want to help other people. Instead of posting into the main channel or whatever, Hey, I'm doing this thing. Does anyone want to help? Reaching out to people that maybe you've worked with who you know care about similar things and just saying, Hey, do you mind picking up one story from here? I think also making the work as concrete and small as possible. For haml-lint, that was one thing I learned through that process. I made a bunch of stories for all of the to-do items.

[01:20:30.040] - Hilary Stohs-Krause

And about halfway through, I halved all the stories because they were just too big. They were taking too long. I was like, I want these to be bite-sized things. So there's more stories, but it'll go faster because people can pick them up on a Friday afternoon and knock it out. I think making it as easy as possible for people to help you, like voicing gratitude out loud when they do help, bragging about it to other people like, Hey, I want to give a shout out. Jessie, she took on three stories for the haml-lint to-do. That was so awesome. I think there's a lot of ways you can do it. A lot of it does, I think, end up being work that's maybe overlooked or taken for granted, which is unfortunate, but I think also, I don't know. For me, I just have to remember, I know this is making everything better. While external validation is nice for my team, it also makes me feel good just to know that I am improving things for everyone that I work with.

[01:21:27.420] - Robby Russell

I really appreciate that perspective on that. For those listening, maybe a quick thought. If you read some documentation in your code base recently that was particularly helpful, see who modified that, who wrote that, and go thank them. If they're around your organization, post a message in Slack or wherever, call them out. Let them know that you read it and you found value in it, because sometimes we don't want to document things if we don't think anyone's ever going to look at it. But also we document things for our future selves, not just for someone else. I think getting things out of our head is important as well. Hilary, I have a quick question. Is there anything about Rails that you wish you had a little bit of extra time to learn more about that you normally might be a little embarrassed to admit?

[01:22:06.100] - Hilary Stohs-Krause

The first thing that came to my mind with that question was I have a conference friend, Ruby friend, who for a long time worked on Ruby proper at Shopify. I always went to her talks because you must support your friends. I always learned something. But every single one of those talks, about half of it, I'm just sitting there going, I have no idea what she's talking about. And not because she was explaining it poorly or speaking about... She was giving a fantastic presentation. It's just I don't have a computer science degree. I'm not particularly interested, if I'm being honest, in a lot of the internals. But I'd watch her talks and I would think, and you'd hear the room go, Oh, or like, Oh. I was like, I want to know what that... I want to be part of that. I want to have that understanding. So honestly, I think I would love to know more what's going on behind the magic, really.

[01:23:04.680] - Robby Russell

I ask that because I just want to remind the audience that we don't know everything, and we're still able to be very productive members of the Ruby on Rails community. There's a lot of things about lower-level internals that I do not understand as well, so that aligns with me. I get asked a lot, Why am I hosting a podcast talking about Ruby on Rails if I don't understand every little aspect of Ruby on Rails? I'm like, because I don't have a place to apply that, I think, is part of the reason. We're not even into Shopify-level performance type issues. And so I'm glad that they have the infrastructure for that. And those people that are working on those things are working on issues that organizations of that size have. And a lot of our listeners are probably completely not in the same space as well.

[01:23:44.640] - Hilary Stohs-Krause

What I always tell all early career folks, I'm like, You will never know as much as you don't know about programming. There's just too much to know. And I also encourage people not to try to know everything. I made a decision probably halfway through my programming career that I just don't care about DevOps. I don't find it interesting. I don't have to do it for work very often. I'm just not going to be a very good contributor to that arena, and so I'm not going to stress about it. Whereas some of the things I do care about, code clarity and accessibility for the web, and right, there are plenty of people who know a bunch about DevOps and are not going to know about that, and I think that's great. I think a lot of people don't know internals, but I am looking at my Post-it notes on my monitors and one of the things that I forget every single time, I always have to look it up, is which way the caret goes when you're comparing date times. I never remember if the past is greater than or less than the future. I never remember, and that's okay.
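For anyone who wants that Post-it note in code form, here is a small Ruby sketch of which way the comparison goes. The `earlier?` helper is a hypothetical name added for illustration; the comparisons themselves use only Ruby's standard `Time` and `Date` classes.

```ruby
require "date"

# Times sit on a number line: earlier moments are "less than" later ones.
# So the past is < the future, and the future is > the past.
def earlier?(first, second)
  first < second
end

past   = Time.utc(2020, 1, 1)
future = Time.utc(2030, 1, 1)

puts earlier?(past, future)   # true: the past is less than the future
puts future > past            # true: the future is greater than the past
puts past <=> future          # -1: the receiver comes first

# Date objects compare the same way.
puts Date.new(2020, 1, 1) < Date.new(2030, 1, 1)  # true
```

One way to remember it: a timestamp is a count of seconds since a fixed starting point, so later times are bigger numbers.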

[01:24:50.920] - Robby Russell

This is what Post-it notes are for. I love a good Post-it note. All right, a couple of quick last questions for you, Hilary. Is there a technical book that you find yourself recommending to peers?

[01:25:02.840] - Hilary Stohs-Krause

It's not a technical book in the sense that it's about computer technology, but there's a talk that I used to give about how our brains perform risk assessment and how we make decisions and how that applies to our work and coding and everything. One of the books that really influenced that talk is called Thinking in Bets by Annie Duke. Yeah, it came out a while ago, but I just love it. I was talking to a friend the other day who was lamenting a code change that she made that broke a thing on production. We had a bunch of Sidekiq jobs, and we had to go clean them out. She was like, I'm so sorry. I should have known better. I was like, No, you made the right decision with the information you had at the time. You're resulting. Which is something I learned from this book. Resulting is when we judge a decision based on the outcome rather than the process by which we made the decision. I think that's so useful in all of life, but especially in the workplace, thinking like, Okay, I tried something. I didn't have all the information I needed. It turned out it didn't go the way that I wanted, but that doesn't mean I made the wrong decision. It just means now I know when I have to make a similar decision in the future, these are some of the questions I should ask or these are some of the things I should look into to help me make a better decision. I love that book. I feel like it gave me a lot of grace in how I judge choices that I make.

[01:26:24.220] - Robby Russell

I like that. I'll definitely include links to that, to Thinking in Bets, right? I'll put that in the show notes for everybody as I might be getting confused with another book called maybe Small Bets, so I might have to look this one up and double check.

[01:26:34.480] - Hilary Stohs-Krause

I believe she was getting her PhD in neuroscience, and she dropped out to become a professional gambler.

[01:26:40.340] - Robby Russell

All right. Well, there you go.

[01:26:41.600] - Hilary Stohs-Krause

It's all about how we make decisions when we don't have all the information.

[01:26:45.280] - Robby Russell

Wonderful. Well, Hilary, I really appreciate the way you think about all this. Where's the best place for folks to follow your work or learn more about what you're working on?

[01:26:53.960] - Hilary Stohs-Krause

This is the most boring, basic millennial answer ever, but honestly, probably just LinkedIn.

[01:27:01.300] - Robby Russell

Okay. I'll definitely toss a link to LinkedIn in the show notes for everybody so they can go find you and see what you're ranting about on the internet. Thank you so much for stopping by to talk shop with us, Hilary. Thanks for joining us on On Rails.

[01:27:14.880] - Hilary Stohs-Krause

Thanks for having me.

[01:27:17.420] - Robby Russell

That's it for this episode of On Rails. This podcast is produced by the Rails Foundation with support from its core and contributing members. If you enjoyed the ride, leave a quick review on Apple Podcasts, Spotify or YouTube. It helps more folks find the show. Again, I'm Robby Russell. Thanks for riding along. See you next time.

 


