#169 - 4 June 2023

Why I Don’t Like “Strategy”, Part II: Execution First; Plus: Rotate external presentations with the Share-Out; Cognitive load; Data engineering code is software, not duct tape; LLM tools

Hi, everyone!

Sorry for missing the last couple of weeks - after a particularly busy stretch, we made some last-minute plans to enjoy last weekend’s holiday here in Canada, and I’m just now catching up.

If one of the last couple of weekends was a long weekend for you, I hope you enjoyed it!


Why I Don’t Like “Strategy”, Part II: Execution First

Last issue I wrote about not being a fan of how our community talks about “strategy”, because it’s a catch-all for very different activities - activities important enough to be routine parts of the job we’ve taken on, rather than something we save for special occasions.

Today I want to talk about another reason I don’t like how we talk about “strategy” - it’s too often used as a way to avoid the quotidian work of running an effective team.

Most Of Our Teams Aren’t Ready For Strategy

When I visit or talk to RCD organizations, the biggest opportunities I see for managers to get better results are overwhelmingly in the day-to-day functioning of the team.

This is completely understandable! We’re not taught to be managers, and the peer teams we might use as role models are almost invariably somewhat haphazardly managed. We don’t have much to guide us.

Because of that, this is often where we can most profitably spend our time first.

I frequently see issues like quiet frustration that something isn’t being done well, team members not collaborating because of some long-avoided conflict, not being sure who’s doing what, duplicated effort or even working at cross-purposes, lack of effective hand-off between teams or sub-teams, poorly articulated goals and expectations, and (increasingly) poor retention.

If there are routine team-functioning issues, if the team isn’t firing on all cylinders, there’s no point in working on strategy, because:

Strategy Without Execution Is Irrelevant

Any meaningful strategy comes down to choices about what to do and not do. None of that matters if things don’t get done.

The kinds of common problems described above — quiet frustration that something isn’t being done well, team members not collaborating because of some long-avoided conflict, not being sure who’s doing what… — halt forward progress no matter which direction “forward” is.

Teams in this situation already have some implicit strategy, and it’s not getting executed well. That’s the problem that needs to be addressed first.

You Can’t Fix Execution Problems With Strategy; or, “Have you tried rubbing some Strategy on it?”

I’ve heard the following argument several times, and even made it myself once or twice:

What this team needs is clarity on vision. Once that’s in place, it’ll be much easier to improve execution because everyone will be pulling in the same direction.

It’s a very plausible argument!

But it doesn’t work.

I’ve tried it myself. I’ve watched as others — in some cases much better leaders than I am — tried it.

Maybe it should work. Maybe I and the others were just holding it wrong.

But the only way I’ve seen to successfully deal with problems of routine operations of the team is to face them head on.

The “strategy first” argument is particularly seductive for many of us, because it conveniently postpones the unpleasant interpersonal stuff to some future point after the strategic Big Thinking is done (“Maybe the problem will fix itself once we’ve Set A Strategy!”). And we’re often much more comfortable doing the Big Thinking work.

But if tasks currently aren’t getting done to a level consistent with our expectations, Strategy won’t fix that. If there’s no culture of feedback and recognition currently, Strategy won’t help. If work is being done that doesn’t align with current priorities, there’ll be work that’s at cross-purposes with the new priorities, too.

Get The Team Firing On All Cylinders First

There’s no glory in doing the routine management work our teams need (and deserve). We don’t get to send glossy documents about it around to stakeholders and decision makers.

But it matters. And it’s the job.

It’s not particularly hard! Yes, it’s labour intensive, and slow, and can be kind of stressful at times. But quietly not dealing with the problem is also kind of stressful.

There are some time-tested approaches for dealing with these problems specifically:

  • Manager 101 level stuff:
    • Do we have a good idea of how things are going day-to-day - do we have some practice like one-on-ones?
    • Are we regularly giving and soliciting good, effective feedback?
    • Do we have some regular practice like quarterly goals for guiding work and professional growth for each of our team members?
    • Are we using data from the one-on-ones and quarterly goals to find opportunities to delegate?
  • Manager 201:
    • Are we having healthy disagreements and resolving them?
    • Does our team have explicit expectations about team work?
    • Are we routinely running effective retrospectives so our team can collectively get better together?
    • Are we encouraging peer feedback within the team?
    • Is there a library of known-good SOPs/processes for our team (and ourselves)?
  • Manager 301:

We know how to address these common issues. There are time-tested approaches, and resources to help. We, and our teams, deserve to have them resolved. And then, thinking about positioning, medium-term prioritization, problem solving, and stakeholder engagement can be valuable.

Are there team issues you see that I haven’t listed here? Any success stories about dealing with them you’d like to share? Any questions you have about any of them? Email me at jonathan@researchcomputingteams.org.


Before we get to the roundup, I really want to hear from you about what you’d like to read more of in this newsletter, what challenges you are facing or you find other managers from the research world are facing, and what tradeoffs you’re considering. I want to make sure this newsletter and occasional resources are as valuable as possible for our community!

You can always email me, but I’d love to have a reader input chat with you if that works better - we could talk about what you’d like to see more (or less!) of, what you think would be most valuable for managers like us, or just ask about things you’re seeing. Feel free to schedule a quick free chat!


And now, on to the roundup!

Managing Individuals and Teams

Across the way at last week’s Manager, Ph.D. I wrote about how to handle an influx of new team members. In the roundup were articles on:

  • Operational discussions in one-on-ones,
  • The importance of not skimping on alignment,
  • Helping introverts in meetings, and
  • How managing up is essential (and good, actually).

This week I talked about some more general considerations when helping a team adapt to change, and the roundup had articles on:

  • Finding, coaching, and managing new leads,
  • Making (and unmaking) hard decisions,
  • Updating senior executives, and
  • Networking at conferences.

Technical Leadership

The Share-Out - Jim Savage

Savage describes a couple of nice approaches to getting all the team members aligned and communicating on a new project by involving them in internal or external discussions; it culminates in his favoured approach, “the Share-Out”:

  • External stakeholders who will ask good questions are asked to attend a full-team call at the end of each period
  • For the first sprint, the boss does a short sprint demo/report of the team’s work, and answers questions, followed by a team brief
  • After the first, a rotating team member owns each sprint/cycle’s worth of work, including the stakeholder presentation
    • They talk to other team members to make sure their work is captured correctly in the new deck
    • They do practice presentations
    • They handle the presentation & Q&A

I’m a huge fan of rotating between (willing) team members for external presentations - it’s great for deepening knowledge, building intra-team “connective tissue”, and growing the team members. I’ve tried pieces of this before but never this systematically - I can imagine it working extremely well.

Do you have team members responsible for big-deal external presentations? What do you find works and doesn’t work? Let me know at jonathan@researchcomputingteams.org.


Research Software Development

Clever Code Considered Harmful - Josh Comeau
Cognitive Load Developer’s Handbook - Artem Zakirullin

Comeau references Kernighan’s old line: “Everyone knows that debugging is twice as hard as writing a program in the first place. So if you’re as clever as you can be when you write it, how will you ever debug it?”

For us, though, it’s often even worse than that; we’re frequently going to be handing off our code to non-dev researchers. If we’re as clever as we can be when coding, what chance do the researchers have of debugging it?

Comeau’s article strongly advocates for preferring readability over cleverness, and using average developers or even interns as the benchmark for readability. This can be enforced with code reviews (or for some things, linting and metrics).

Zakirullin talks about cognitive load - not just complexity or cleverness, although that’s part of it, but also how spread out the knowledge needed to understand the code is, and how many facts one has to hold in one’s head at once.
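
As a small, hypothetical illustration of what both articles are getting at (it isn’t from either piece - the function and data are made up): these two functions do the same thing, but the “clever” one makes the reader unpack several ideas at once, while the readable one lets a non-developer collaborator check each step on its own.

```python
# A made-up example contrasting "clever" with readable code; not from either article.

# "Clever": dense, quietly leans on a truthiness trick ("or 1"), and recomputes
# min() inside the comprehension - several facts to hold in your head at once.
def normalize_clever(xs):
    return [(x - min(xs)) / ((max(xs) - min(xs)) or 1) for x in xs]

# Readable: the same behaviour, but each step is named and easy to inspect or debug.
def normalize_readable(values):
    """Rescale values to [0, 1]; a constant list maps to all zeros."""
    lowest = min(values)
    highest = max(values)
    value_range = highest - lowest
    if value_range == 0:
        return [0.0 for _ in values]
    return [(v - lowest) / value_range for v in values]

if __name__ == "__main__":
    data = [3.0, 7.5, 1.2, 9.9]
    assert normalize_clever(data) == normalize_readable(data)
    print(normalize_readable(data))
```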

Readability and understandability are vital considerations for research software. Our communities are rightly strongly in favour of open science, of using open source code for science so the methods are transparent and reproducible. But if the code is impenetrable, then we’re throwing away some of that transparency.

A referee would, rightly, bounce a manuscript back for having a Methods section that’s “cleverly” written but requires hours upon hours of tracing through to understand. Referees have standards for readability, and if we’re serious about our software being a research output like a paper (or a research input many people will use and contribute to), we should have the same standards.


Research Data Management and Analysis

From Chaos to Collaboration: 5 “Do’s and Don’ts” for Data Engineers Working in Teams - Eden Bar-Tov

Like bash scripts for sysadmin tasks, we tend to think of data engineering scripts and pipelines as one-offs, to be written out and forgotten about. But also like bash scripts for sysadmin tasks, they tend to last a long time and eventually become load-bearing.

Bar-Tov shares some of his advice for data engineers, which largely applies what we’ve learned over the last few decades of software development to data pipelines:

  • Share design plans and alternatives before committing
  • Invest in quality code review - for knowledge transfer and a second pair of eyes
  • Pay attention to data quality (a minimal sketch of what that can look like follows this list)
  • Write code with an aim to being generic and reusable (within reason)
  • Allocate tasks with growth and interests in mind
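
On the data-quality point specifically, here’s a hedged sketch of the kind of lightweight, fail-fast validation that can sit at the start of a pipeline step. The column names, thresholds, and function are made up for illustration - they aren’t from Bar-Tov’s article - but the pattern of refusing bad data early, with a clear error message, is the point.

```python
import pandas as pd

# Hypothetical, minimal data-quality checks for a pipeline step; the column names
# and thresholds are illustrative only.
def validate_measurements(df: pd.DataFrame) -> pd.DataFrame:
    """Fail fast with a clear message instead of letting bad rows flow downstream."""
    required_columns = {"sample_id", "measured_at", "value"}
    missing = required_columns - set(df.columns)
    if missing:
        raise ValueError(f"Missing required columns: {sorted(missing)}")

    if df["sample_id"].duplicated().any():
        raise ValueError("Duplicate sample_id values found")

    null_fraction = df["value"].isna().mean()
    if null_fraction > 0.01:
        raise ValueError(f"Too many missing values: {null_fraction:.1%}")

    return df

if __name__ == "__main__":
    df = pd.DataFrame(
        {
            "sample_id": [1, 2, 3],
            "measured_at": pd.to_datetime(["2023-06-01", "2023-06-01", "2023-06-02"]),
            "value": [0.12, 0.34, 0.56],
        }
    )
    validate_measurements(df)  # raises if the data doesn't meet expectations
    print("data passed basic quality checks")
```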

Speaking of data quality, data wrangler is a new tool from Microsoft (with VSCode integration) which looks pretty cool, as does datalab (a “data linter”).


This also looks useful - thebe adds Jupyter cells to otherwise static webpages, connecting to a jupyter lab server or, with thebe-lite, running it in jupyter-lite.


Emerging Technologies and Practices

My prediction (disclaimer, my day job is at NVIDIA) is that deep learning frameworks and models, and especially those for generative AI models like LLMs, are quickly going to become, from our point of view as people who support research, something akin to (say) linear algebra libraries for our researchers.

It won’t be quite as ubiquitous as linear algebra libraries, if only because deep learning itself uses linear algebra under the hood. But I think it’s a useful mental model, because there will be:

  • A lot of need for these packages, frameworks, and models,
  • A wide variety of implementations that have different capabilities and tradeoffs,
  • Frequent nudges from us towards using existing, well-tested implementations rather than rolling their own (how many triple-loop matrix multiplies have I seen in my day; a small illustration follows this list), and
  • Only so much need for developers of deep learning frameworks or foundation models, but a large and growing need for development around them, to make use of them effectively.
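
To make that “use the existing, well-optimized stuff” nudge concrete, here’s the classic example - a hand-rolled triple-loop matrix multiply next to the library call that should almost always replace it. The code is illustrative, not taken from anything linked here, and the same argument applies to deep learning frameworks and pre-trained models.

```python
import numpy as np

# The classic hand-rolled triple-loop matrix multiply: correct, but slow and
# reinventing something that tuned libraries already do far better.
def matmul_naive(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    n, k = a.shape
    k2, m = b.shape
    assert k == k2, "inner dimensions must match"
    out = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            for p in range(k):
                out[i, j] += a[i, p] * b[p, j]
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.random((64, 64))
    b = rng.random((64, 64))
    # The library call (a @ b) is typically orders of magnitude faster at realistic
    # sizes and has been tested far more thoroughly than our loops ever will be.
    assert np.allclose(matmul_naive(a, b), a @ b)
```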

Our data science/engineering/management teams will be closer to this than others, but if I’m right, even many of our software and systems teams will need some familiarity with some combination of:

  • The frameworks and packages available,
  • The models available,
  • A number of ways to effectively fine-tune models, on-node and multi-node,
  • Tools for assessing model performance,
  • Optimizing models for inference for size and performance - quantization, pruning, … (a minimal quantization sketch follows this list),
  • Deploying model inference services efficiently and effectively, and
  • Tools and approaches for building tools and mini-applications around the inference
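
As one small, hedged example of what that optimization work can look like, here’s PyTorch’s post-training dynamic quantization applied to a toy model. The model and sizes are made up for illustration, and dynamic quantization is only one of many approaches - this is a sketch of the idea, not a recommended workflow.

```python
import torch
import torch.nn as nn

# A toy model standing in for something real; the architecture is made up for illustration.
class TinyClassifier(nn.Module):
    def __init__(self, n_features: int = 128, n_classes: int = 10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 256),
            nn.ReLU(),
            nn.Linear(256, n_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

if __name__ == "__main__":
    model = TinyClassifier().eval()

    # Post-training dynamic quantization: the Linear layers' weights are stored as int8,
    # which shrinks the model and often speeds up CPU inference, at some cost in accuracy.
    quantized = torch.quantization.quantize_dynamic(
        model, {nn.Linear}, dtype=torch.qint8
    )

    x = torch.randn(1, 128)
    with torch.no_grad():
        print("fp32 output:", model(x)[0, :3])
        print("int8 output:", quantized(x)[0, :3])
```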

Eventually I’d like to put together a curriculum for a crash-course for research computing & software teams about effectively supporting use of these tools. Does that sound useful? What does your team need, and what has it found useful? I’d love to hear - just email me at jonathan@researchcomputingteams.org and if there’s demand I’ll try to put something together.

For now, what I’ll be doing is reporting useful-looking tools, tutorials, and techniques as they come up. In the past couple of weeks, some things that caught my eye:


Random

Learn how https works by watching this website fetch itself, byte by byte, over TLS.

Computational problems can be far, far harder than NP-complete.

Deep dive into dm-verity, file system integrity for embedded linux (in the kernel since 3.4).

Super handy MacOS utility I had never heard about - networkQuality (and a server you can stand up if you want to test particular links).

This is cool - a clustered Map Of GitHub, by the person who brought you Map Of Reddit.

Finally, a handheld 386 gaming pc.

MS Paint finally gets a dark mode.

Prime finding using find(1).

A formal model of x86 instructions in Z3.

.bashrc for python REPLs - PYTHONSTARTUP.
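
If you haven’t used it: Python runs the file named by the PYTHONSTARTUP environment variable before the first prompt of an interactive session (scripts are unaffected). The contents below are just a hypothetical example of the sort of thing people put in theirs.

```python
# ~/.pythonstartup.py - a hypothetical example of a PYTHONSTARTUP file.
# Activate it from your shell profile with something like:
#   export PYTHONSTARTUP="$HOME/.pythonstartup.py"

import json
import sys
from pathlib import Path
from pprint import pprint  # handy to have pre-imported in every REPL session

def cat(path):
    """Quickly dump a small text file to the screen."""
    print(Path(path).read_text())

# A small reminder that the startup file ran, without polluting stdout.
print(f"PYTHONSTARTUP loaded (Python {sys.version.split()[0]})", file=sys.stderr)
```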

WebAssembly is going to be more composable with POSIX programs - TCP, pthreads, pipes… - with WASIX.

Outputting SVG images directly from Postgres.

IP6oS3 - you know, IP6 over S3, like you do.

Bioinformaticians are finding ChatGPT useful for automating some tasks.


That’s it…

And that’s it for another week. Let me know what you thought, or if you have anything you’d like to share about the newsletter or management. Just email me or reply to this newsletter if you get it in your inbox.

Have a great weekend, and good luck in the coming week with your research computing team,

Jonathan

About This Newsletter

Research computing - the intertwined streams of software development, systems, data management and analysis - is much more than technology. It’s teams, it’s communities, it’s product management - it’s people. It’s also one of the most important ways we can be supporting science, scholarship, and R&D today.

So research computing teams are too important to research to be managed poorly. But no one teaches us how to be effective managers and leaders in academia. We have an advantage, though - working in research collaborations has taught us the advanced management skills, but not the basics.

This newsletter focusses on providing new and experienced research computing and data managers the tools they need to be good managers without the stress, and to help their teams achieve great results and grow their careers.