Prompting Techniques

Prompt Engineering

Mick McQuaid

University of Texas at Austin

23 Sep 2025

Week Five

Agenda

  • Presentations: Xiaoqi, Zhizhou
  • News
  • Review whatiknow (Dhruvi)
  • eB review
  • eC preview
  • m1 questions
  • Finish previous chatbot
  • Introduce Google AI Studio
  • Techniques

Presentations

News

s1

  • From simplescaling
  • Trained on 1,000 examples
  • Each example is ⟨a question⟩ ⟨an answer⟩ ⟨a reasoning process⟩
  • First tried 59,000 examples

The Batch and Data Points

⟨pause to look at last week’s edition⟩

WhatIKnow (Dhruvi)

⟨pause to discuss contributions⟩

Writing and formatting the doc

  1. Write about topics that excite you. If it greatly interests you, it’s more likely to greatly interest others in the class.
  2. Sign your contributions at the end of the contribution. You can either write your name or use a smart chip with your Google identity.
  3. Add a horizontal rule before and after your contribution. (Only after if there is already one before!)
  4. Use headings and subheadings in your contribution.
  5. Use links in your contribution. Use Links from the Insert menu.
  6. Consolidate tabs: there is no reason to have three tabs for Evaluation. Either delete the second and third tabs (especially the material about articles that were assigned readings) or move their contents into the first tab. Move the content from the Prompt Optimization tab into the Prompt Techniques tab, which already has a contribution about Prompt Optimization anyway.

eB review

Smart strategies

  • adjust the temperature (needs API)
  • adjust the max_tokens (needs API)
  • do it piecemeal
  • use more than one LLM
  • think about the goal more than constraints
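The first two strategies above are API-only knobs. A minimal sketch of where they live, assuming an OpenAI-style Chat Completions request (the parameter names and model name here are illustrative, not prescriptive; adapt them to whichever provider you use):

```python
# Sketch of the two API-only knobs from the list above, expressed as an
# OpenAI-style Chat Completions payload. Model name and prompt content
# are hypothetical placeholders.
request = {
    "model": "gpt-4o-mini",  # placeholder model choice
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize the Titanic dataset."},
    ],
    "temperature": 0.2,   # lower = more deterministic output
    "max_tokens": 150,    # hard cap on the length of the reply
}

# With the official SDK this payload would be sent as, e.g.:
#   client.chat.completions.create(**request)
print(request["temperature"], request["max_tokens"])
```

Temperature trades determinism against variety; max_tokens limits cost and reply length but can truncate answers mid-sentence if set too low.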

eC preview

⟨look at the doc⟩

m1 questions

Project levels

There are two groups of students in the class: novices (you’ve just met the requirement of one Python course) and advanced students (you have significant development experience).

It is vital that the course meets the needs of both groups, which is not easy.

To achieve this, one necessity is to have varying levels of projects. Broadly there are two levels, novice and advanced. Both are eligible for an A grade. You should work comfortably at your limit, not shirking but not trying to be heroic.

You should deploy something you can put in your portfolio. It can be anything from a simple chatbot like the one we created using chatlas to an agentic app built with, say, Google’s ADK. Include extensive use of prompting and make it possible to evaluate and compare prompts.

Deliverable

  • A short qmd / html document describing the domain
  • The doc should include specification of model(s)
  • The doc should include a discussion of the possible tools and/or datasets you may use (actual tools and/or datasets are due in m2)
  • You are not required to stick with what you describe here, but it should be your current best guess of what you plan to do
  • Examples: chatbot to emulate a foreign leader; chatbot to triage banking problems; chatbot to analyze tweets; note that you can do other genAI tasks that require prompt engineering, such as image generation, as long as a conversation is involved
  • Note that I’m expecting to see two levels of projects: novice (you’ve just met the requirement of one Python course) or advanced (you have significant development experience)

Simple chatbot revisited

We’ll create a chatbot about the famous Titanic dataset.

pip install shiny
shiny create --template querychat --github posit-dev/py-shiny-templates/gen-ai

Notes

The requirements.txt file contains some spurious code. Delete the part about the python-package.

The app.py file is missing the load_dotenv() code. Add it after line 5.

from dotenv import load_dotenv
load_dotenv()

Otherwise, follow the onscreen instructions.

Google AI Studio Intro

We’ll start by doing the same thing in this environment that we did with chatlas: create a chatbot that offers expense policy advice.

Techniques

According to Schulhoff et al. (2024)

tree of techniques

Important Note

There is no substitute for reading Schulhoff et al. (2024)! I’m just listing the main concepts here. I’ll ask you to pick one and explain it in your own words.

Top level

  • Zero-Shot
  • Few-Shot
  • Thought Generation
  • Ensembling
  • Self-Criticism
  • Decomposition

Few-Shot Design Decisions

  • Exemplar Quantity: as many as possible
  • Exemplar Ordering: randomly order them
  • Exemplar Label Distribution: balance the distribution
  • Exemplar Label Quality: ensure correct labeling
  • Exemplar Format: use a common format
  • Exemplar Similarity: select similar examples to the test instance
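The design decisions above can be sketched as a small prompt builder. A minimal illustration (the exemplars, format, and function name are all hypothetical toy choices, not from Schulhoff et al.):

```python
import random

def build_few_shot_prompt(exemplars, query, seed=0):
    """Assemble a few-shot prompt following the design decisions above:
    randomly order the exemplars and render each one in a common format,
    leaving the final label blank for the model to complete."""
    exemplars = list(exemplars)
    random.Random(seed).shuffle(exemplars)               # Exemplar Ordering
    lines = [f"Input: {x}\nLabel: {y}" for x, y in exemplars]  # common format
    lines.append(f"Input: {query}\nLabel:")              # test instance
    return "\n\n".join(lines)

# Toy sentiment exemplars with a balanced label distribution
exemplars = [
    ("I loved this movie", "positive"),
    ("Terrible acting", "negative"),
    ("A delightful surprise", "positive"),
    ("Waste of time", "negative"),
]
prompt = build_few_shot_prompt(exemplars, "What a great film")
print(prompt)
```

Note how the balanced labels and the shared `Input:`/`Label:` format map directly onto the Label Distribution and Exemplar Format decisions in the list.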

Few-Shot Techniques

  • these are more difficult to implement than basic few-shot prompting
  • K-Nearest Neighbors
  • Vote-K
  • Self-Generated In-Context Learning
  • Prompt Mining
  • more complicated techniques use iterative filtering, embedding and retrieval, and reinforcement learning

Zero-Shot Techniques

  • use no exemplars
  • Role Prompting
  • Style Prompting
  • Emotion Prompting
  • System 2 Attention
  • SimToM
  • Rephrase and Respond
  • Re-reading
  • Self-Ask
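As a concrete instance of the family above, Role Prompting needs no exemplars at all, only a persona prepended to the task. A minimal sketch (the role and task strings are hypothetical examples):

```python
def role_prompt(role, task):
    """Zero-shot Role Prompting: no exemplars, just a persona plus the task."""
    return f"You are {role}. {task}"

p = role_prompt(
    "an experienced travel-expense auditor",
    "Explain whether a $95 team dinner is reimbursable under a $75 per-person policy.",
)
print(p)
```

Style Prompting and Emotion Prompting work the same way: a zero-shot prefix that steers tone or stakes rather than persona.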

Thought Generation

  • prompting the model to articulate its ongoing reasoning
  • Chain-of-Thought
  • Zero-Shot Chain-of-Thought
  • Step-Back Prompting
  • Analogical Prompting
  • Thread-of-Thought Prompting
  • Tabular Chain-of-Thought
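Zero-Shot Chain-of-Thought from the list above is the simplest to show: append the well-known reasoning trigger phrase so the model articulates its reasoning before answering. A minimal sketch (the question is a toy example):

```python
def zero_shot_cot(question):
    """Zero-Shot Chain-of-Thought: append the classic trigger phrase so the
    model spells out its reasoning before giving a final answer."""
    return f"Q: {question}\nA: Let's think step by step."

prompt = zero_shot_cot("If I have 3 apples and eat one, how many remain?")
print(prompt)
```

Few-shot CoT (next slide) instead supplies worked examples whose answers themselves contain reasoning chains.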

Few-Shot CoT

  • multiple examples, including chains-of-thought
  • Contrastive CoT Prompting
  • Uncertainty-Routed CoT Prompting
  • Complexity-based Prompting
  • Active Prompting
  • Memory-of-Thought Prompting
  • Automatic Chain-of-Thought Prompting

Decomposition

  • explicitly decomposing the problem into subproblems
  • Least-to-Most Prompting
  • Plan-and-Solve Prompting
  • Tree-of-Thought Prompting
  • Recursion-of-Thought Prompting
  • Program-of-Thoughts
  • Faithful Chain-of-Thought
  • Skeleton-of-Thought
  • Metacognitive Prompting
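The decomposition idea can be sketched as a Least-to-Most loop: ask the model to split the question into subquestions, solve them in order while accumulating context, then answer the original question. This is a structural sketch only; `ask_model` stands in for any LLM call, and the stub below exists purely so the code runs:

```python
def least_to_most(question, ask_model):
    """Least-to-Most Prompting sketch: decompose the question into
    subquestions, solve them in order (feeding earlier answers into later
    prompts), then answer the original question with that context."""
    subproblems = ask_model(
        f"Break this question into simpler subquestions, one per line:\n{question}"
    ).splitlines()
    context = ""
    for sub in subproblems:
        answer = ask_model(f"{context}Q: {sub}\nA:")
        context += f"Q: {sub}\nA: {answer}\n"
    return ask_model(f"{context}Q: {question}\nA:")

# Stub model for illustration only: returns canned replies
def stub(prompt):
    if prompt.startswith("Break"):
        return "How many apples?\nHow many oranges?"
    return "stub answer"

result = least_to_most("How many pieces of fruit?", stub)
print(result)
```

Plan-and-Solve is similar but asks for a plan in one prompt rather than solving subquestions one at a time.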

Ensembling

  • using multiple prompts to solve the same problem, then aggregating the results, for example, by majority vote
  • Demonstration Ensembling
  • Mixture of Reasoning Experts
  • Max Mutual Information Method
  • Self-Consistency
  • Universal Self-Consistency
  • Meta-Reasoning over Multiple CoTs
  • DiVeRSe
  • Consistency-based Self-Adaptive Prompting
  • Universal Self-Adaptive Prompting
  • Prompt Paraphrasing
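Self-Consistency from the list above has the simplest aggregation step: sample several reasoning chains at a nonzero temperature, extract each final answer, and take a majority vote. A minimal sketch of just the vote (the sampled answers are toy data):

```python
from collections import Counter

def self_consistency(final_answers):
    """Self-Consistency aggregation: majority vote over the final answers
    extracted from several independently sampled reasoning chains."""
    return Counter(final_answers).most_common(1)[0][0]

# Final answers extracted from five sampled chains (toy data)
answers = ["42", "42", "41", "42", "40"]
print(self_consistency(answers))
```

The other ensembling methods listed vary what is sampled (prompts, experts, paraphrases) and how votes are weighted, but share this aggregate-over-many-runs shape.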

Self-Criticism

  • prompting the model to critique its own output
  • Self-Calibration
  • Self-Refine
  • Reversing Chain-of-Thought
  • Self-Verification
  • Chain-of-Verification
  • Cumulative Reasoning
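Self-Refine is the most direct illustration of this family: draft, critique, revise, repeat. A structural sketch only; `ask_model` stands in for any LLM call, and the stub exists purely so the loop runs:

```python
def self_refine(task, ask_model, rounds=2):
    """Self-Refine sketch: produce a draft, ask the model to critique it,
    then ask for a revision that incorporates the critique; repeat."""
    draft = ask_model(f"Answer: {task}")
    for _ in range(rounds):
        critique = ask_model(f"Critique this answer:\n{draft}")
        draft = ask_model(
            f"Revise the answer using the critique.\n"
            f"Answer:\n{draft}\nCritique:\n{critique}"
        )
    return draft

# Stub model for illustration only: numbers its replies
calls = []
def stub(prompt):
    calls.append(prompt)
    return f"reply {len(calls)}"

final = self_refine("Explain the F1 score.", stub, rounds=1)
print(final)
```

One round costs three model calls (draft, critique, revision), so self-criticism techniques trade extra latency and tokens for quality.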

Exercise

Run the two readings, Sahoo et al. (2024) and Schulhoff et al. (2024), through NotebookLM. Ask it to summarize them and then ask about the discrepancy between the F1 scores in Schulhoff’s case study. (Manual had high precision and low recall, while automated had the reverse.)

Then consider the diagram on the following screen, showing graphically the definitions of precision and recall (F1 is the harmonic mean of these two statistics). Comment on your view of how NotebookLM has described the difference between the F1 scores.
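Since F1 is the harmonic mean of precision and recall, two systems with opposite precision/recall profiles can land on the same F1, which is exactly the kind of discrepancy the exercise probes. A quick check with toy numbers (these are illustrative, not the paper's figures):

```python
def f1(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Toy numbers, not from the paper: high precision / low recall and the
# reverse yield the same F1, so F1 alone can hide the manual-vs-automated
# difference described in the case study.
print(f1(0.9, 0.3))   # high precision, low recall
print(f1(0.3, 0.9))   # low precision, high recall, same F1
```

This is why the exercise asks you to judge whether NotebookLM explains the *source* of the discrepancy rather than just comparing the scores.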

END

References

Sahoo, Pranab, Ayush Kumar Singh, Sriparna Saha, Vinija Jain, Samrat Mondal, and Aman Chadha. 2024. “A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications.” https://arxiv.org/abs/2402.07927.
Schulhoff, Sander, Michael Ilie, Nishant Balepur, Konstantine Kahadze, Amanda Liu, Chenglei Si, Yinheng Li, et al. 2024. “The Prompt Report: A Systematic Survey of Prompting Techniques.” https://arxiv.org/abs/2406.06608.
Wilkinson, Leland. 2005. The Grammar of Graphics (Statistics and Computing). Secaucus, NJ, USA: Springer-Verlag.

Colophon

This slideshow was produced using Quarto.

Fonts are Roboto Light, Roboto Bold, and Victor Mono Nerd Font.