A collection of practical guides and system prompts for using LLMs effectively in software and analysis development.
This repository contains guidelines for working with Large Language Models (LLMs) to generate code. The focus is on maintaining human control, creating sustainable workflows, and producing quality code through clear communication with AI assistants.
Core Philosophy: You stay in charge of design and decisions. The LLM implements what you specify.
The Hare and The Tortoise: LLMs are great servants and bad masters. If you rush ahead and expect an LLM to generate the whole of whatever you're creating in one go, you will end up with more code than you can follow, read, or understand. Somewhere in that mass there will be an error. This error will not be obvious to you, nor will it be to the LLM that generated it. What follows is a slow and painful debugging process in which your early, apparent speed gains are lost incrementally. The quickest path to success is to take small, well-specified steps at a steady pace.
The repo contains guides and examples for using LLMs to generate analysis code and workflows.
- How do I get concise code instead of explanations?
- How do I maintain context across multiple sessions?
- How do I keep the LLM from suggesting things I don't want?
- How do I integrate LLM work with git workflow?
- How do I help the LLM understand my existing codebase?
- How do I prevent the LLM from making design decisions?
- LLM-Code-Generation-Guide.md - Comprehensive best practices guide
- System-Prompt-R-Analysis.md - System prompt for R/RStudio work
- System-Prompt-Python-Package.md - System prompt for Python package development
- System-Prompt-Python-Script.md - System prompt for Python scripting
- System-Prompt-Snakemake-Project.md - System prompt for Snakemake workflows based on blank_snake
Start here: Read sections 1-3 of the main guide, LLM-Code-Generation-Guide.md:
- Writing Effective Prompts
- Controlling Output Verbosity
- Engineering Process: Plan First
Try this: Practice with a simple task using the prompt templates. See how specific constraints improve results.
Key insight: Vague prompts get verbose, generic responses. Specific prompts with constraints get usable code.
Read:
- 4. Understanding Existing Codebases
- 5. The TODO_TREE.md System
- 6. Git Workflow Integration
Try this: Create a TODO_TREE.md for a current project. Use it in your next LLM session.
Key insight: LLMs lack persistent memory. A TODO tree gives them context across sessions.
Use the system prompts:
- R users: Statistical analysis and RStudio work → System-Prompt-R-Analysis.md
- Python package developers: Building libraries → System-Prompt-Python-Package.md
- Python scripters: Command-line tools → System-Prompt-Python-Script.md
Try this: Copy the relevant system prompt into your LLM's custom instructions or paste it at the start of sessions.
Key insight: System prompts set boundaries. They prevent LLMs from "helping" in ways you don't want.
Instead of:
"Can you help me write a function to validate email addresses?"
Write:
Write a Python function to validate email addresses with regex.
Requirements:
- Accept string, return bool
- Check format: user@domain.tld
- Include type hints and docstring
Output: Code only.
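For illustration, a response to that prompt might look like the following minimal sketch (the regex and the function name are assumptions, not prescribed by the guide):

```python
import re

# Simple structural check only: user@domain.tld. Does not verify deliverability.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[A-Za-z]{2,}$")

def is_valid_email(address: str) -> bool:
    """Return True if the string matches the user@domain.tld format."""
    return bool(EMAIL_PATTERN.match(address))
```

Because the prompt pinned down the signature, the format rule, and the output, the response is directly usable instead of a page of explanation.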
Create tree:
## 1. Setup [>]
├─ 1.1 Database [✓]
└─ 1.2 API structure [>] ← CURRENT
├─ 1.2.1 Routes [ ]
└─ 1.2.2 Controllers [ ]

Prompt LLM:
TODO tree: [paste above]
I'm on task 1.2.1. Implement Express routes for user CRUD.
After completing:
Done with 1.2.1. Code: [paste]
Update tree: mark 1.2.1 done, move to 1.2.2.
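For reference, the updated tree after this exchange would look like this, using the same status markers as above:

## 1. Setup [>]
├─ 1.1 Database [✓]
└─ 1.2 API structure [>]
├─ 1.2.1 Routes [✓]
└─ 1.2.2 Controllers [>] ← CURRENT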
Wrong approach:
"I have survey data. What analysis should I run?"
Right approach:
"Generate R code to run linear regression of satisfaction on age + income.
Use lm(). Output: just the code block."
The first invites unwanted suggestions. The second gets you code for your decision.
Tell the LLM exactly what you want and don't want. "Code only" prevents explanations. "Use only pandas" prevents alternative approaches.
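As a hedged illustration of that kind of constraint, a prompt such as "Group sales by region and sum the totals. Use only pandas. Code only." (the file and column names here are hypothetical) might yield:

```python
import pandas as pd

# Aggregate using pandas only, as the constraint requires; no other libraries.
sales = pd.read_csv("sales.csv")
totals_by_region = sales.groupby("region")["total"].sum().reset_index()
print(totals_by_region)
```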
Don't dump entire codebases. Share directory structure, then configs, then relevant files. Layer by layer.
Create a hierarchical task list (TODO_TREE.md). Update it as you work. Share it at the start of each session. The tree becomes the project's persistent memory.
One task = one commit = one tree update. Your git history mirrors your task tree.
Design and analysis decisions: yours. Code implementation: LLM's. Keep this boundary clear.
❌ Asking the LLM to make decisions:
- "Should I use REST or GraphQL?"
- "What statistical test is appropriate?"
- "How should I structure this?"
✅ Having LLM implement your decisions:
- "Implement REST endpoints for [specification]"
- "Generate code for t-test comparing groups A and B"
- "Create [specific structure] following this pattern"
❌ Overwhelming with context:
- Pasting entire 5000-line files
- Sharing unrelated code
- No structure to information
✅ Targeted context:
- Relevant files only
- Directory tree for structure
- Specific sections of large files
❌ Letting output run wild:
- Open-ended questions
- No format specifications
- Accepting verbose explanations
✅ Constraining output:
- "Code only"
- "Format: [specify structure]"
- "No explanations unless asked"
1. Plan task → Add to TODO_TREE.md
2. Share tree with LLM → Get context restoration
3. Request implementation → Be specific
4. Review code → Test it
5. Update tree → Mark done
6. Commit code + tree → Single atomic commit
7. Repeat
These materials are templates. Adapt them:
- For your team: Add your conventions, tech stack, processes
- For your domain: R prompts emphasize statistics; yours might emphasize embedded systems
- For your style: Prefer base R over tidyverse? Update the system prompt
- For your workflow: Use different status markers? Change the TODO_TREE legend
The principles remain: clarity, constraints, context, control.