4. How LLMs work - GPT architecture
Input processing
Input
Output
Decoder
Contextual and Session Handling
Safety and Content Filters
Bootstrap prompts
5. How LLMs work - Tokenisation
Swift is a powerful and intuitive programming language for all Apple platforms.
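An illustrative sketch of the idea: real GPT models use a learned byte-pair-encoding (BPE) tokeniser that can split words into subword pieces, but even a naive whitespace tokeniser shows the core step — text becomes a sequence of discrete units, each mapped to an integer ID. (The vocabulary here is built on the fly just for the demo.)

```python
# Naive illustration only: real BPE tokenisers split on learned subword
# boundaries, not whitespace.
sentence = "Swift is a powerful and intuitive programming language for all Apple platforms."

# Split into word-like units, treating the trailing period as its own token.
tokens = sentence.replace(".", " .").split()

# Assign each distinct token an integer ID, in order of first appearance.
vocab = {tok: idx for idx, tok in enumerate(dict.fromkeys(tokens))}
token_ids = [vocab[tok] for tok in tokens]

print(tokens)
print(token_ids)
```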
6. How LLMs work - Token Embeddings
Vectors with 12288 dimensions (features) and positional encoding:
[-0.05, 0.017, …, -0.01]
[-0.03, 0.118, …, -0.02]
[-0.01, 0.007, …, -0.004]
…
Swift is a powerful and intuitive programming language for all Apple platforms.
7. How LLMs work - Token Embeddings
This vector encapsulates the semantic meaning of Google, minus the
semantics of the word Kotlin, plus the semantics of the word Swift.
Google - Kotlin + Swift = ?
8. How LLMs work - Token Embeddings
Google - Kotlin + Swift = Apple
9. How LLMs work - Token Embeddings
apple, 0.426
spotify, 0.396
amazon, 0.393
netflix, 0.387
yahoo, 0.382
snapchat, 0.371
huawei, 0.368
Google - Kotlin + Swift =
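The analogy arithmetic above can be reproduced with cosine similarity over word vectors. A minimal sketch with made-up 3-dimensional embeddings (real models use thousands of learned dimensions; these toy values are chosen only so the analogy works out):

```python
import math

# Toy embeddings, invented for illustration.
embeddings = {
    "google":  [0.9, 0.1, 0.8],
    "kotlin":  [0.1, 0.9, 0.7],
    "swift":   [0.1, 0.9, 0.2],
    "apple":   [0.9, 0.1, 0.3],
    "spotify": [0.6, 0.2, 0.5],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# google - kotlin + swift, componentwise
query = [g - k + s for g, k, s in
         zip(embeddings["google"], embeddings["kotlin"], embeddings["swift"])]

# Rank the remaining words by similarity to the query vector.
ranked = sorted(
    (w for w in embeddings if w not in ("google", "kotlin", "swift")),
    key=lambda w: cosine(query, embeddings[w]),
    reverse=True,
)
print(ranked[0])  # → apple
```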
10. How LLMs work - GPT architecture
Input processing
Transformer layer
Transformer layer
Transformer layer
120 layers (GPT-4)
Next word prediction
Decoder
Input
Output
Input + next word
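The loop the slide describes — predict the next word, append it to the input, repeat — can be sketched with a toy predictor. Here a hard-coded bigram table stands in for the transformer stack; a real GPT computes the next-token distribution with its layers.

```python
# Toy stand-in for next-word prediction: each word maps to one fixed successor.
bigram = {
    "Swift": "is",
    "is": "a",
    "a": "powerful",
    "powerful": "language",
}

def generate(prompt: str, max_tokens: int = 10) -> str:
    tokens = prompt.split()
    for _ in range(max_tokens):
        nxt = bigram.get(tokens[-1])
        if nxt is None:      # no known continuation: stop generating
            break
        tokens.append(nxt)   # feed the prediction back in: input + next word
    return " ".join(tokens)

print(generate("Swift"))  # → Swift is a powerful language
```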
12. Generative AI as a service - OpenAI API
POST https://api.openai.com/v1/chat/completions

{
  "model": "gpt-3.5-turbo",
  "messages": [{
    "role": "user",
    "content": "Describe Turin in one sentence"
  }],
  "temperature": 0.7
}
13. Generative AI as a service - OpenAI API
struct AIMessage: Codable {
    let role: String
    let content: String  // matches the "content" key in the API JSON
}

struct AIRequest: Codable {
    let model: String
    let messages: [AIMessage]
    let temperature: Double
}

extension AIRequest {
    static func gpt3_5Turbo(prompt: String) -> AIRequest {
        .init(model: "gpt-3.5-turbo",
              messages: [
                  .init(role: "user", content: prompt)
              ],
              temperature: 0.7)
    }
}
14. Generative AI as a service - OpenAI API
@MainActor
func invoke(prompt: String) async throws -> String {
15. Generative AI as a service - OpenAI API
@MainActor
func invoke(prompt: String) async throws -> String {
    let openAIURL = URL(string: "https://api.openai.com/v1/chat/completions")!
    var request = URLRequest(url: openAIURL)
    for (field, value) in config {
        request.setValue(value, forHTTPHeaderField: field)
    }
    request.httpMethod = "POST"
    request.httpBody = try encoder.encode(AIRequest.gpt3_5Turbo(prompt: prompt))

    let (data, response) = try await URLSession.shared.data(for: request)
    guard let httpURLResponse = response as? HTTPURLResponse else {
        throw AIServiceError.badResponse
    }
    guard 200...399 ~= httpURLResponse.statusCode else {
        throw AIServiceError.httpError(
            code: httpURLResponse.statusCode,
            description: String(data: data, encoding: .utf8) ?? ""
        )
    }
    let aiResponse = try decoder.decode(AIResponse.self, from: data)
    return aiResponse.choices.first?.message.content ?? ""
}
20. Generative AI as a service - Amazon Bedrock
Model                  Input               Output
Titan Text Lite        $0.30 / 1M tokens   $0.40 / 1M tokens
Claude 3 Haiku         $0.25 / 1M tokens   $0.15 / 1M tokens
Titan Image Generator  $0.005 / image
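A quick back-of-envelope check of what a call costs at per-million-token rates like those above (the token counts are made-up examples):

```python
def cost_usd(input_tokens: int, output_tokens: int,
             in_rate_per_m: float, out_rate_per_m: float) -> float:
    # Rates are quoted per 1M tokens.
    return (input_tokens * in_rate_per_m + output_tokens * out_rate_per_m) / 1_000_000

# e.g. 10k input + 2k output tokens at $0.30 in / $0.40 out per 1M tokens
print(f"${cost_usd(10_000, 2_000, 0.30, 0.40):.4f}")  # → $0.0038
```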
21. Generative AI as a service - Amazon Bedrock
iOS app → API Gateway → Lambda → Bedrock (all inside AWS)
22. Generative AI as a service - Amazon Bedrock
import json
import boto3

def invoke_model(prompt: str):
    try:
        enclosed_prompt = "Human: " + prompt + "\n\nAssistant:"
        body = {
            "prompt": enclosed_prompt,
            "max_tokens_to_sample": 500,
            "temperature": 0.5,
            "stop_sequences": ["\n\nHuman:"],
        }
        bedrock_runtime_client = boto3.client("bedrock-runtime")
        response = bedrock_runtime_client.invoke_model(
            modelId="anthropic.claude-v2", body=json.dumps(body)
        )
        response_body = json.loads(response["body"].read())
        completion = response_body["completion"]
        return completion
    except Exception as e:
        raise e
23. Generative AI as a service - Amazon Bedrock
def lambda_handler(event, context):
    body = json.loads(event["body"])
    prompt = body["content"]
    if event["headers"]["authorization"] != key:
        return {
            'statusCode': 401
        }
    try:
        completion = invoke_model(prompt=prompt)
        return {
            'statusCode': 200,
            'body': json.dumps({"content": completion})
        }
    except Exception as e:
        return {
            'statusCode': 400,
            'body': json.dumps(str(e))
        }
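A handler like this can be exercised locally by stubbing out the Bedrock call; a minimal sketch, where the `key` value and the stubbed `invoke_model` are hypothetical stand-ins:

```python
import json

key = "secret-token"  # hypothetical API key stand-in

def invoke_model(prompt: str) -> str:
    # Stub standing in for the real Bedrock call.
    return f"Echo: {prompt}"

def lambda_handler(event, context):
    body = json.loads(event["body"])
    prompt = body["content"]
    if event["headers"]["authorization"] != key:
        return {'statusCode': 401}
    try:
        completion = invoke_model(prompt)
        return {'statusCode': 200,
                'body': json.dumps({"content": completion})}
    except Exception as e:
        return {'statusCode': 400, 'body': json.dumps(str(e))}

# Simulated API Gateway event
event = {
    "headers": {"authorization": "secret-token"},
    "body": json.dumps({"content": "Describe Turin in one sentence"}),
}
response = lambda_handler(event, context=None)
print(response["statusCode"])  # → 200
```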
24. Generative AI as a service - Amazon Bedrock
POST https://[uuid]...amazonaws.com/dev/completions
{
"content": "Describe Turin in one sentence"
}
25. Generative AI as a service - Amazon Bedrock
def schedule_prompt_template(content: str) -> str:
    return f"""
<context>
... // Swift Heroes 2024 schedule ...
</context>
Task for Assistant:
Find the most relevant answer in the context
<question>
{content}
</question>
"""
26. Generative AI as a service - Amazon Bedrock
def lambda_handler(event, context):
    …
    try:
        templatedPrompt = prompt
        if template == "schedule":
            templatedPrompt = schedule_prompt_template(content=prompt)
        elif template == "chat":
            templatedPrompt = chat_context_template(content=prompt)
        completion = invoke_claude(prompt=templatedPrompt)
        return {
            'statusCode': 200,
            'body': json.dumps({"content": completion})
        }
    except Exception as e:
        …
27. Generative AI as a service - Amazon Bedrock
POST https://[uuid]...amazonaws.com/dev/completions
{
"content": "Describe Turin in one sentence”,
"template": “schedule"
}
28. Generative AI as a service - Amazon Bedrock
@MainActor
func invokeBedrock(content: String, template: AITemplate = .schedule) async throws -> String {
    …
    request.httpBody = try encoder.encode(BedrockAIRequest(content: content,
                                                           template: template.rawValue))
    …
    let bedrockResponse = try decoder.decode(BedrockAIResponse.self, from: data)
    return bedrockResponse.content
}

let answer = try await aiService.invokeBedrock(content: text, template: .schedule)

struct BedrockAIRequest: Codable {
    let content: String
    let template: String
}

enum AITemplate: String {
    case schedule, chat
}
31. Prompt engineering
The emerging skill set focused on designing and optimizing inputs to AI
systems so that they generate the desired outputs.
32. Prompt engineering patterns
One-shot, Few-shot and Zero-shot prompts
Summarize all the talks between context tags
Provide output as an array of json objects with a title, one main topic, and an array of tags.
The tags should not repeat the main topic.
<example>
Title: "Inclusive design: crafting Accessible apps for all of us"
Abstract: "One of the main goals for an app developer should be to reach everyone. ...
<result>
{{
"title": "Inclusive design: crafting Accessible apps for all of us",
"topic": "Accessibility",
"tags": ["user experience", "design", "assistance", "tools", "SwiftUI", "Swift", "Apple"]
}}
</result>
<context>
// Swift Heroes 2024 schedule …
</context>
</example>
33. Prompt engineering patterns
[
{
"title": "Building Swift CLIs that your users and contributors love",
"topic": "Swift CLIs",
"tags": ["terminal", "Tuist", "contributors"]
},
{
"title": "Macro Polo: A new generation of code generation",
"topic": "Swift Macros",
"tags": ["code generation", "modules", "examples"]
},
{
"title": "Typestate - the new Design Pattern in Swift 5.9",
"topic": "Typestate Design Pattern",
"tags": ["state machines", "memory ownership", "generics", "constraints", "safety"]
},
{
"title": "A Tale of Modular Architecture with SPM",
"topic": "Modular Architecture",
"tags": ["MVVM", "SwiftUI", "SPM"]
},
…
]
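Output in this shape can be consumed directly by the app. A small sketch parsing one entry of the reply and checking the contract the prompt asked for (tags must not repeat the main topic):

```python
import json

# One entry of the model's reply, copied from the sample output above.
reply = """[
  {"title": "Building Swift CLIs that your users and contributors love",
   "topic": "Swift CLIs",
   "tags": ["terminal", "Tuist", "contributors"]}
]"""

talks = json.loads(reply)
for talk in talks:
    # The prompt required that tags do not repeat the main topic.
    assert talk["topic"] not in talk["tags"]

print(talks[0]["topic"])  # → Swift CLIs
```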
34. Prompt engineering patterns
Persona and audience
How does async/await work in Swift?
Act as an experienced iOS developer
Explain it to a 12-year-old
35. Prompt engineering patterns
Chain-of-thought
How does async/await work in Swift?
Act as an experienced iOS developer
Explain to the Product manager
Follow this Chain-of-thought:
- Technical details, using an analogy understandable to the audience
- Benefits relevant to the audience
- Additional things to consider
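The persona, audience, and chain-of-thought pieces can also be composed programmatically; a sketch, where the `cot_prompt` helper is hypothetical:

```python
def cot_prompt(question: str, persona: str, audience: str, steps: list) -> str:
    # Assemble persona + audience + chain-of-thought instructions into one prompt.
    step_lines = "\n".join(f"- {s}" for s in steps)
    return (
        f"Act as {persona}.\n"
        f"Explain to {audience}.\n"
        f"Follow this chain-of-thought:\n{step_lines}\n\n"
        f"Question: {question}"
    )

prompt = cot_prompt(
    "How does async/await work in Swift?",
    "an experienced iOS developer",
    "the Product manager",
    ["Technical details, using an analogy understandable to the audience",
     "Benefits relevant to the audience",
     "Additional things to consider"],
)
print(prompt)
```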
40. Pair programming with AI tools - ChatGPT
Discovering Swift Packages
Converting from legacy code
Writing unit tests and generating test data
Multiplatform development
Brainstorming and evaluating ideas