Learning How to Write Prompts

Spell class is now in session! The tutorial this article follows is here -> portal.

This article is purely me having fun; I recommend reading the original.

Prompt Engineering

A prompt is what everyone calls a "spell": only a well-written spell brings out a model's true power. Let's begin. (Disclaimer: apart from the examples, this article was generated by an LLM.)

TT, but not TikTok

First things first, let's look at two concepts:

  • Temperature
  • Top_p

Temperature

Simply put, the lower the Temperature, the more deterministic the result (the model picks the most likely next token). Raising it introduces more randomness, which makes the output more diverse (less likely tokens get more weight). To be concrete: if the use case is fact-based Q&A, you'd want to turn this parameter down; conversely, if you want the model to produce creative writing, turn it up a bit.
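Conceptually, temperature rescales the model's scores (logits) before the softmax. A toy sketch of that idea in Python; the logit values here are made up for illustration:

import math

def softmax_with_temperature(logits, temperature=1.0):
    # Dividing logits by T before softmax: a low T sharpens the
    # distribution (near-greedy), a high T flattens it (more random).
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
print(softmax_with_temperature(logits, 0.2))  # ~[0.99, ...], almost deterministic
print(softmax_with_temperature(logits, 1.5))  # noticeably flatter, more diverse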

Top_p

Similar to Temperature, this value also shapes the output. The difference is that it works through the sampling pool: a value of 0.1, for example, means the next token is chosen only from the tokens making up the top 10% of the probability mass. So now you know: the larger this value, the more "random" the output; the smaller, the more deterministic.

Note that how this parameter behaves is tied to the model: the same value can act differently on different models. Also, these two parameters can each be adjusted independently.
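To make the mechanism concrete, here is a toy sketch of nucleus (top-p) filtering over a hypothetical next-token distribution; real implementations operate on the model's full vocabulary, but the idea is the same:

def top_p_filter(probs, p=0.9):
    # Keep the smallest set of tokens whose cumulative probability
    # reaches p, then renormalize; sampling happens inside this nucleus.
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    nucleus, cumulative = [], 0.0
    for token_id, prob in ranked:
        nucleus.append((token_id, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in nucleus)
    return [(token_id, prob / total) for token_id, prob in nucleus]

probs = [0.5, 0.3, 0.15, 0.05]  # hypothetical next-token distribution
print(top_p_filter(probs, p=0.1))  # only the single most likely token survives
print(top_p_filter(probs, p=0.9))  # top three tokens, renormalized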

For GPT-3, here are some reference combinations:

Use Case            Temperature    Top_p
Code generation     0.2            0.1
Creative writing    0.7            0.8
Chatbot             0.5            0.5
Data analysis       0.2            0.2
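For reference, passing these settings looks roughly like this with the legacy openai Python SDK (pre-1.0); the API key and prompt here are placeholders, not from the original article:

import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

# The "Code generation" row from the table above: low temperature, narrow top_p.
response = openai.Completion.create(
    model="text-davinci-003",
    prompt="# A Python function that reverses a string\ndef",
    temperature=0.2,
    top_p=0.1,
    max_tokens=128,
)
print(response.choices[0].text)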

With that, we can officially start learning spells. You can open the OpenAI Playground and have a play.

From here on, I'll use

Prompt

Output

to mark the input and output on my side of the Playground. The model used is text-davinci-003, with temperature=0.7 && top_p=1.

Prompt

The sky is

Output

The sky is blue

The sky is usually blue during the day, due to the atmosphere of the Earth scattering blue light from the sun more than other colors. At night, the sky can appear to be black due to the lack of sunlight in the atmosphere.

That was quite a mouthful, wasn't it? The output can easily overshoot what you originally wanted. What if we tweak it like this:

Prompt

Complete the sentence:
The sky is

Output

Complete the sentence:
The sky is ______
blue.

See, the output is much better now, because we gave a clear instruction (what needs to be done? -> complete the sentence).
Besides that, you may have noticed GPT output a line of underscores. This tells us that our spells actually care about format. Let's test that:

Prompt

The sky is ______

Output

The sky is ______

blue

See, this time we gave no textual instruction, yet a single line (the format) expressed our intent just as well. One more example:

Prompt

This is awesome! // Positive
This is bad! // Negative
Wow that movie was rad! // Positive
What a horrible show! //

Output

This is awesome! // Positive
This is bad! // Negative
Wow that movie was rad! // Positive
What a horrible show! // Negative

What a Prompt Is Made Of

From the examples above, we can conclude that a good prompt contains:

  • Instruction - a clear directive
  • Input Data - our input

Beyond these, to get better results, we can usually also add:

  • Context - auxiliary information that helps the model return better results
  • Output Indicator - the format we want the result in

Of course, the model doesn't demand that you bring all of these; even if all you type is the single letter A, it will enthusiastically generate a multiple choice question for you (lol).
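Assembling the four components is just string concatenation; a minimal sketch (the variable values are illustrative, not from the original guide):

instruction = "Classify the text into neutral, negative or positive."
context = "You are a sentiment analysis assistant."  # optional Context
input_data = "Text: What a horrible show!"           # Input Data
output_indicator = "Sentiment:"                      # Output Indicator

prompt = "\n".join([context, instruction, input_data, output_indicator])
print(prompt)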

Prompting Techniques

Alright, now that we've got simple prompts down, here comes the big stuff.

Zero-shot Prompting

Zero-shot, simply put, means we don't have to teach the model how to do the job; we just issue the order. Looking back at the sentiment example from the previous section, we fed the model three examples before asking our actual question. But for many tasks we don't actually need to explain anything; the model already knows how!

Behold:

Prompt

Classify the text into neutral, negative or positive.
Text: What a horrible show!
Sentiment:

Output

Classify the text into neutral, negative or positive.
Text: What a horrible show!
Sentiment: Negative

Of course, if you try changing the Text, you'll find the result isn't always that reliable. That's when Few-Shot Prompting enters the stage.

Few-Shot Prompting

Yes, we've already used this kind of spell above. In short, with no groundwork at all, the model often underperforms, so we provide a little context and a few worked examples. The "few" here can be 1-shot, 3-shot, 5-shot, or more... depending on the complexity of the task.
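Mechanically, this is again just string assembly. A sketch of building a k-shot prompt from labeled examples, reusing the text // label format from earlier (the helper name is my own, not from the original guide):

def build_few_shot_prompt(examples, query):
    # One "text // label" line per example, then the query with
    # the label left blank for the model to fill in.
    lines = [f"{text} // {label}" for text, label in examples]
    lines.append(f"{query} //")
    return "\n".join(lines)

examples = [
    ("This is awesome!", "Positive"),
    ("This is bad!", "Negative"),
    ("Wow that movie was rad!", "Positive"),
]
print(build_few_shot_prompt(examples, "What a horrible show!"))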

Let's get a first taste with this example:

Prompt

A "walao" is a cute, small animal native to Singapore. An example of a sentence that uses the word walao is:
Walao wants to beat the shit out of you!

Now, to do a "qazwsxedc" means to clap your hands. An example of a sentence that uses the word qazwsxedc is:

Output

A "walao" is a cute, small animal native to Singapore. An example of a sentence that uses the word walao is:
Walao wants to beat the shit out of you!

Now, to do a "qazwsxedc" means to clap your hands. An example of a sentence that uses the word qazwsxedc is:
Qazwsxedc and cheer to show your support!

OK, the model completed the task.
Next, let's look at some tips for using Few-shot Prompting:

  • The format of the examples can have a curious effect on the result
  • Random labels sometimes work curiously well (here a label means the answer in the examples above)

I say "curious" because, a lot of the time, what the model generates really is... curious. Let's jump straight in and see:

Prompt

This is awesome! // Negative
This is bad! // Positive
Wow that movie was rad! // Positive
What a horrible show! //

Output

This is awesome! // Negative
This is bad! // Positive
Wow that movie was rad! // Positive
What a horrible show! // Negative

We randomized the labels (Negative and Positive), yet the model somehow still gave the correct answer. Experiments also show that GPT remains fairly robust even when we randomize the format:

Prompt

Positive This is awesome!
This is bad! Negative
Wow that movie was rad!
Positive
What a horrible show! --

Output

Positive This is awesome!
This is bad! Negative
Wow that movie was rad!
Positive
What a horrible show! -- Negative

Of course, Few-shot Prompting is no silver bullet either. Take the example below:

Prompt

The odd numbers in this group add up to an even number: 15, 32, 5, 13, 82, 7, 1.
A:

Output

The odd numbers in this group add up to an even number: 15, 32, 5, 13, 82, 7, 1.
A: No, the odd numbers in this group add up to an odd number (122).

The answer is off the mark, and wrong on top of that. So, can we do better with few-shot prompting?

Prompt

The odd numbers in this group add up to an even number: 4, 8, 9, 15, 12, 2, 1.
A: The answer is False.
The odd numbers in this group add up to an even number: 17, 10, 19, 4, 8, 12, 24.
A: The answer is True.
The odd numbers in this group add up to an even number: 16, 11, 14, 4, 8, 13, 24.
A: The answer is True.
The odd numbers in this group add up to an even number: 17, 9, 10, 12, 13, 4, 2.
A: The answer is False.
The odd numbers in this group add up to an even number: 15, 32, 5, 13, 82, 7, 1.
A:

Output

The odd numbers in this group add up to an even number: 4, 8, 9, 15, 12, 2, 1.
A: The answer is False.
The odd numbers in this group add up to an even number: 17, 10, 19, 4, 8, 12, 24.
A: The answer is True.
The odd numbers in this group add up to an even number: 16, 11, 14, 4, 8, 13, 24.
A: The answer is True.
The odd numbers in this group add up to an even number: 17, 9, 10, 12, 13, 4, 2.
A: The answer is False.
The odd numbers in this group add up to an even number: 15, 32, 5, 13, 82, 7, 1.
A: The answer is True.

Hmmmm, still wrong. This shows that few-shot prompting has limited value for this kind of problem. Even when we get an answer that looks right, it isn't reliable. The reason is that the leap from input to label is too large: when we humans look at these problems, the answer doesn't arrive in one step; it comes out of a sequence of reasoning. That reasoning process is the so-called chain of thought.

To put it plainly: we should explain to the model where the answers come from (the thinking process), rather than bluntly writing down only the final result.
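For reference, the ground truth of the running example is easy to verify: sum the odd numbers and check the parity. A quick check:

numbers = [15, 32, 5, 13, 82, 7, 1]
odd_sum = sum(n for n in numbers if n % 2 == 1)
print(odd_sum)           # 41
print(odd_sum % 2 == 0)  # False: the prompt's claim is false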

Chain-of-Thought (COT) Prompting

As explained above, we need to walk the model through the steps. And once that's in place, a single example (1-shot) is enough to solve the problem. See for yourself:

Prompt

The odd numbers in this group add up to an even number: 4, 8, 9, 15, 12, 2, 1.
A: Adding all the odd numbers (9, 15, 1) gives 25. The answer is False.
The odd numbers in this group add up to an even number: 15, 32, 5, 13, 82, 7, 1.
A:

Output

The odd numbers in this group add up to an even number: 4, 8, 9, 15, 12, 2, 1.
A: Adding all the odd numbers (9, 15, 1) gives 25. The answer is False.
The odd numbers in this group add up to an even number: 15, 32, 5, 13, 82, 7, 1.
A: Adding all the odd numbers (15, 5, 13, 7, 1) gives 41. The answer is False.

Correct at last.

At this point, a bold idea emerges: can we just throw the thinking itself over to the model?

Exactly: this is Zero-shot COT.

Prompt

I went to the market and bought 10 apples. I gave 2 apples to the neighbor and 2 to the repairman. I then went and bought 5 more apples and ate 1. How many apples did I remain with?

Output

I went to the market and bought 10 apples. I gave 2 apples to the neighbor and 2 to the repairman. I then went and bought 5 more apples and ate 1. How many apples did I remain with?

I remained with 12 apples.

Wrong! We simply asked the question as-is. Now for the magic trick.

Prompt

I went to the market and bought 10 apples. I gave 2 apples to the neighbor and 2 to the repairman. I then went and bought 5 more apples and ate 1. How many apples did I remain with?
You need to think step by step.

Output

I went to the market and bought 10 apples. I gave 2 apples to the neighbor and 2 to the repairman. I then went and bought 5 more apples and ate 1. How many apples did I remain with?
You need to think step by step.

First, you had 10 apples. Then you gave away 4 apples, so you were left with 6 apples.
Then you bought 5 more apples, so now you have 11 apples.
Then you ate 1 apple, so you are left with 10 apples.

Look! We added just one sentence, yet the model's output changed dramatically, and on top of that, we got the correct answer.

Building on COT, the literature offers many more prompting techniques, e.g., ReAct, Active-Prompt, Self-Consistency, Tree-of-Thoughts, etc.
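As one taste of these, here is a minimal sketch of the Self-Consistency idea: sample several chain-of-thought completions at a non-zero temperature and majority-vote on the final answers. The extract_answer helper and the prompt suffix are my own assumptions, not an official recipe, and the legacy openai SDK (pre-1.0) is assumed:

import re
from collections import Counter

import openai

def extract_answer(completion_text):
    # Hypothetical helper: take the last number in the completion
    # as the final answer.
    numbers = re.findall(r"-?\d+", completion_text)
    return numbers[-1] if numbers else None

def self_consistent_answer(prompt, n=5):
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt + "\nYou need to think step by step.",
        temperature=0.7,  # keep some randomness so the samples differ
        max_tokens=256,
        n=n,              # draw several independent samples
    )
    answers = [extract_answer(choice.text) for choice in response.choices]
    # Majority vote over the extracted answers.
    return Counter(a for a in answers if a is not None).most_common(1)[0][0]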

Tips && Best Practices

Finally, let's round up a few practical bits.

When you want to make instructions explicit, use ### or """ to separate the Instruction from the content (Context). This is the format we mentioned earlier.

Prompt

### Instruction ###
Translate the text below to Spanish:
Text: "hello!"

Output

### Instruction ###
Translate the text below to Spanish:
Text: "hello!"

¡Hola!

In particular, when generating code, make good use of leading words (e.g. import).
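For instance, ending the prompt with a leading word nudges the model to continue with code instead of prose; a small illustrative sketch (the comment line is made up, not from the original article):

# Ending the prompt with "import" biases the model toward emitting
# Python code rather than an English explanation.
prompt = (
    "# Write a simple Python function that converts miles to kilometers\n"
    "import"
)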

Also, be as specific as you can, and avoid fuzzy words like "few"; for example, "use 2-3 sentences" works better than "keep it fairly short".

Finally, say DO NOT less, and say what TO DO more. For example:

Prompt

The following is an agent that recommends movies to a customer. DO NOT ASK FOR INTERESTS. DO NOT ASK FOR PERSONAL INFORMATION.
Customer: Please recommend a movie based on my interests.
Agent:

Output

The following is an agent that recommends movies to a customer. DO NOT ASK FOR INTERESTS. DO NOT ASK FOR PERSONAL INFORMATION.
Customer: Please recommend a movie based on my interests.
Agent:
What kind of movie are you looking for? Do you prefer action, comedy, romance, or something else?

But if we rewrite it as:

Prompt

The following is an agent that recommends movies to a customer. The agent is responsible to recommend a movie from the top global trending movies. It should refrain from asking users for their preferences and avoid asking for personal information. If the agent doesn't have a movie to recommend, it should respond "Sorry, couldn't find a movie to recommend today.".
Customer: Please recommend a movie based on my interests.
Agent:

Output

The following is an agent that recommends movies to a customer. The agent is responsible to recommend a movie from the top global trending movies. It should refrain from asking users for their preferences and avoid asking for personal information. If the agent doesn't have a movie to recommend, it should respond "Sorry, couldn't find a movie to recommend today.".
Customer: Please recommend a movie based on my interests.
Agent: Sorry, I'm unable to recommend a movie based on your interests. However, I can recommend one of the top global trending movies. How about 'Joker'?