How I use LLMs

This article (which I intend to keep as a living document) captures my experience of productively using Claude for writing code and related activities. It describes my personal experience - I am not preaching it as the only right one, but I hope you may find something interesting for yourself here. As it is a living document, I have expanded its title from the original "My techniques for working with Claude", anticipating future edits. So far, though, it really is "My techniques for working with Claude".

I have been experimenting with LLMs since the very first ChatGPT (so, probably 2 years or so by now?), and watching the progress has been nothing short of impressive. However, if you expect a rose-tinted post about how LLMs will replace all software engineers, this isn't it. My opinion has always been that LLMs excel as a "bicycle for the brain", and I still stand by it: you, the human, must still use your brain, even if in different ways; otherwise, the activity turns into a waste of time - even though it is quite fun.

This is a collection of approaches that I have accumulated over time and tend to use. First and foremost, a word about policy matters: I am lucky enough that the majority of my work is open source; so whenever I write "code" here, I mean "open source code". For proprietary code, the policies and tools you use will be different, and thus the techniques you apply may be different as well.

My use of Claude is limited to the web-based UI and the app. The reason for this choice is partly historical (I started using LLMs before agents and code plugins were available), but it also allows me the most flexibility across all the different scenarios, without being limited to narrower tasks focused purely on code.

As for the techniques: first, a couple of broad principles that apply to nearly all scenarios.

General principles

The first and foremost rule: "do not force it". As I said, I view LLMs as a bicycle for the mind. And same as with a bicycle, it is important to be cognizant of the path ahead and make judgement calls about when to dismount and walk, rather than forcing it and risking a fall. When I use an LLM to get things done, it has to be an accelerator rather than a drag. So, if after several attempts at a problem - roughly 15-30 minutes with the model - the outputs are still wrong or hallucinated, I switch to manual mode. Hallucinations or evasiveness are the LLM's way of signaling "I don't know". You can probably put this explicitly into the prompt, but I found that it is not always reliable, so the time cap works very well as a safeguard. The boundary is not a hard science; it is just enough to allow for 5-8 iterations.

The second general principle is to be conscious of specificity: if you want to achieve a specific objective, be clear about what you are looking for. Imagine that you are describing the final result to someone without the context.

The types of tasks that I have found LLMs to be useful for, and which I will talk about below:

These use cases apply primarily to Rust code, though I have also used LLMs for C. I have heard others report good results with Python and JavaScript, but I have had relatively little exposure to them.

Prototyping entirely new code

This is the simplest and most obvious approach. Describe what you want to obtain, grab the result, try it out; iterate. I try to avoid filling the context window with corrections, so if the output at any step is not what I want, I go back one step and edit the prompt to steer the model in the direction I want. The key to good results here is being precise from the get-go in specifying what you aim to achieve, and focusing on a single specific goal, with constraints - "do FOO; do not do BAR and BAZ".

Generally, the more straightforward the task is, the better the result. Think of this as a very intelligent autocomplete, rather than a replacement for your thinking process.

When you run out of tokens for the turn and need to click "continue", I have noticed that in-place editing of a single artefact is often glitchy. I am not sure whether this is a bug in the front end or a fundamental issue, but I found it very useful to explicitly ask in the initial prompt to create a separate artefact for each burst, and then join them later. That said, 3-4 bursts is generally about the maximum for useful code; after that the LLM often "loses focus". However, even 4 bursts is usually 3-4 thousand LOC, so when it works it is a good booster.

A word of caution: expect subtle bugs and be prepared to debug them. This means fully reading through and understanding the code that the LLM is generating. This is a bit of a challenge, and it is also a signal to end the session: if the code starts to become unclear and buggy, it means you are trying to do too much. If you still want to use the LLM, figure out how to split the task into smaller, independent subtasks, and restart. Chances are that one of the parts of the work is "hard", and that is why you are getting a suboptimal result. Needless to say, being very granular upfront is the best option.

Extending the functionality

Overall, I found that extending existing functionality is tricky: the LLM has a tendency to rewrite unrelated parts, sometimes making them worse. Thus, version control and review of the changes are paramount; be prepared to switch to manual mode, possibly using a couple of iterations' worth of changes as inspiration for yourself.

It is best if you can formulate the new functionality as a completely independent module, which you can integrate later. Be prepared to refactor heavily in the process. What often helps to minimize this is uploading the related files and describing, at a high level, what they do.
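As a sketch of what "a completely independent module" can look like in Rust (all names here are hypothetical, not taken from any real project): keep the existing code dependent only on a narrow trait, and let the LLM-generated module implement that trait.

```rust
// Hypothetical example: isolating LLM-generated functionality behind a
// narrow trait, so it can be integrated or replaced without touching
// the existing code.
pub trait Checksum {
    fn checksum(&self, data: &[u8]) -> u32;
}

// The generated module only has to implement the trait.
pub struct SimpleSum;

impl Checksum for SimpleSum {
    fn checksum(&self, data: &[u8]) -> u32 {
        // Trivial illustrative algorithm: sum all bytes.
        data.iter().map(|&b| b as u32).sum()
    }
}

// Existing code depends only on the trait, not on the generated module.
fn verify(c: &dyn Checksum, data: &[u8], expected: u32) -> bool {
    c.checksum(data) == expected
}

fn main() {
    // 'h'+'e'+'l'+'l'+'o' = 104+101+108+108+111 = 532
    assert!(verify(&SimpleSum, b"hello", 532));
}
```

If the generated module turns out to be buggy, it can be rewritten or replaced behind the trait without disturbing the integration points.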

A relatively successful case is when you need to create functionality similar to something that already exists, such as a protocol dissector. "Please take a look at the attached implementations for protocols X, Y, Z and write an implementation for T" saved quite a lot of time in writing the first cut of the code. If you can give examples to the LLM, always do it. Again, be prepared to refactor and bugfix.
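For illustration, a first cut from such a prompt might look like the skeleton below. The protocol "T" and its header layout are entirely invented for this example; a real prompt would attach the existing X, Y, Z dissectors so the model mirrors their structure.

```rust
// Hypothetical first-cut dissector for an invented protocol "T":
// a 4-byte fixed header with version, message type, and a
// big-endian length field.
#[derive(Debug, PartialEq)]
struct Header {
    version: u8,
    msg_type: u8,
    length: u16,
}

// Parse the fixed header, rejecting truncated input.
fn dissect(buf: &[u8]) -> Option<Header> {
    if buf.len() < 4 {
        return None; // too short to contain the fixed header
    }
    Some(Header {
        version: buf[0],
        msg_type: buf[1],
        length: u16::from_be_bytes([buf[2], buf[3]]),
    })
}

fn main() {
    let pkt = [0x01, 0x07, 0x00, 0x2a];
    let h = dissect(&pkt).expect("valid header");
    assert_eq!(h.version, 1);
    assert_eq!(h.msg_type, 7);
    assert_eq!(h.length, 42);
    assert!(dissect(&[0x01]).is_none()); // truncated input is rejected
}
```

The value of the "by example" prompt is that the generated code tends to mirror the structure of the attached dissectors, so the first cut usually needs refactoring and bug-fixing rather than a full rewrite.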

The criterion for "switching to manual" is about the same as with new code: trivial compile errors are fine to just "vibe-fix" by pasting the error messages, but the logic errors you will have to deal with yourself.

"Stack overflow question" mode

This is a good enough multitool for making small tweaks and creating examples and templates to start from. It mimics the use of question-and-answer sites like Stack Overflow by targeting a single specific issue or uncertainty, rather than a broader task.

Taking a wrong turn due to a wrong answer here can cost you a lot of time - so it is paramount to cross-check the validity of the answers.

A useful approach is asking the same (or almost the same) question multiple times and analyzing the results. Asking 5-6 times gives a good idea of some of the possible solutions. Note that the absence of a given solution here does not mean it does not exist: if you have a hunch about an approach but did not see it in the results, it is useful to suggest it explicitly and see if the model takes the bait. Another productive technique is to give some knowingly incorrect hints and observe the LLM refute them - sometimes this strengthens the resulting answers.

Exploring the potential solution space

Think of this as an enhanced version of a Google search plus a pre-prototype, for when you know roughly what you want but are not sure which tools, libraries and frameworks are available. The results here are purely throwaway, useful for your own education and practice - so I tend to be fairly flexible: using the results to make searches, reading blogs and articles on the subject, feeding the ideas that pop into my head while reading them back into the LLM, and then using those results for further searches of my own.

My hunch is that the "interlacing" in this technique provides a means for both the human brain and the LLM to avoid local optimums and being "stuck in a rut", which limit the "creative" output.

I see a common pattern across different areas - music rendered by music LLMs, podcasts generated by NotebookLM, prose or poetry written by the usual LLM suspects: there is a very rigid "personal style" that becomes painfully apparent once the initial "wow" factor of the first 5-6 outputs wears off. Intuitively this makes sense, and to some extent it is not unlike the human creative "personality" - however, the human brain is in a constant state of adaptation, while the LLM weights are fixed, so the "personal style" shows through much more distinctly and makes the results boring very quickly.

This pattern is a lot subtler when having an LLM write code, but given its prevalence across the other domains, I think the "interlacing" technique is an improvement here as well.

Summarization and documentation

The results here are mixed, but given that the task is extremely straightforward, I use it now and then - usually as a warm-up before digging into the full text, or as a way to document code that would otherwise be completely throwaway. The key to success here is the same as with all the other tasks: keep it simple and of manageable size.

Commenting and review

This is a powerful hack that I have read about, and its main power comes from the fact that the human brain is as susceptible to local optimums as LLMs are. When you have written something, you are necessarily tied to the mental space of the work, so it is very easy to omit critical bits of context that seem obvious to you but that the reader will not have. Feeding the result to an LLM and requesting a review of given aspects provides another perspective which, while limited compared to a review by a good human reviewer, allows you to improve the text at very little cost, and without losing your unique style.

This concludes the list of techniques; I hope you found something interesting for yourself in it. I will try to keep this article updated as time passes - I am sure that as the technology evolves, it will have to be adapted with new ideas.

If I were to summarize this long form into take-home messages: do not force it; be specific about what you want; keep the tasks small and granular; and always read and understand the code the LLM produces.

If you have a comment - feel free to make it on BlueSky.