Below is a simplified chat bubble for a conversation with an LLM.
The template is written in Templ and uses htmx to fill the
chat bubble with incoming content. The incoming content is sent as Server-Sent
Events (SSE) by the v2 version of the r3labs/sse server. This version supports
SplitData, which splits multi-line event data into one data: line per line and
so makes it easier to send the HTML for htmx. Wrapping every token in a
<span></span> helps preserve the spaces in the tokens.
templ StreamingChatBubble(url string, message string) {
	<div>
		<span>user</span>
		<div>{ message }</div>
	</div>
	<div
		hx-ext="sse"
		sse-connect={ url }
		sse-swap="done"
		hx-swap="outerHTML"
	>
		<span>assistant</span>
		<div
			sse-swap="message"
			hx-swap="beforeend"
		></div>
	</div>
}
templ StreamingChatComplete(message string) {
	<div class="...">
		<span>assistant</span>
		<div>{ message }</div>
	</div>
}
The url argument contains the SSE URL with a stream parameter. The Go web
server uses this parameter to find the right channel for updates. Updates
published to the channel are sent to the frontend. Each message has an event
type. Messages that carry a partial LLM response have the type message. Once
all partial responses have been sent, a done message is sent with the complete
chat bubble. Because the outer element swaps itself out with outerHTML, this
also removes the connection to the message channel.
event: message
data: <span> token1</span>

event: message
data: <span>token2</span>

event: message
data: <span> token3</span>

event: done
data: <div>
data: <div>token1token2 token3</div>
data: </div>
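The framing of these messages can be sketched in a few lines of Go. This is not the r3labs/sse implementation; formatEvent is a hypothetical helper that only illustrates the wire format SplitData is meant to produce: one event line with the type, one data: line per line of HTML, and a blank line that ends the event.

```go
package main

import "strings"

// formatEvent is a hypothetical helper (not part of r3labs/sse). It renders
// one Server-Sent Event in the text/event-stream wire format: the event type,
// one "data:" field per line of the payload, and a terminating blank line.
func formatEvent(event, data string) string {
	var b strings.Builder
	b.WriteString("event: " + event + "\n")
	for _, line := range strings.Split(data, "\n") {
		b.WriteString("data: " + line + "\n")
	}
	b.WriteString("\n")
	return b.String()
}
```

The browser joins the data: lines of one event with newlines again, so htmx receives the multi-line HTML fragment in one piece.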
The SSE server (simplified)
To use the SSE server, you need an endpoint for the server itself (called
/events here).
Next, you need a way to start a new stream. In a chat application, you have a
textarea in a form that POSTs to the /streaming endpoint.
<form hx-post="/streaming" hx-trigger="submit" hx-target="#chat-messages" hx-swap="beforeend">
	<textarea name="prompt"></textarea>
	<button type="submit">Chat</button>
</form>
<div id="chat-messages"></div>
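package main

import (
	"bytes"
	"context"
	"fmt"
	"math/rand"
	"net/http"
	"strings"

	"github.com/r3labs/sse/v2"
	// plus the application's own component (Templ) and llm packages
)

func main() {
	sseServer := sse.New()
	// Split multi-line event data into one data: line per line of HTML.
	sseServer.SplitData = true
	http.Handle("/events", sseServer)
	http.HandleFunc("/streaming", func(w http.ResponseWriter, r *http.Request) {
		streamID := fmt.Sprintf("stream-%d", rand.Int63())
		sseServer.CreateStream(streamID)
		prompt := r.FormValue("prompt")
		go func() {
			defer sseServer.RemoveStream(streamID)
			// call LLM and get tokens
			res, err := llm.CreateCompletion(...)
			// publish every token
			contentBuilder := strings.Builder{}
			for ... {
				sseServer.Publish(streamID, &sse.Event{
					Event: []byte("message"),
					Data:  []byte("<span>" + token + "</span>"),
				})
				contentBuilder.WriteString(token)
			}
			// Render the final bubble into a buffer, not into w: the handler
			// has usually returned by the time the goroutine gets here, so
			// neither w nor the request context can be relied on.
			output := bytes.Buffer{}
			component.StreamingChatComplete(contentBuilder.String()).Render(context.Background(), &output)
			sseServer.Publish(streamID, &sse.Event{
				Event: []byte("done"),
				Data:  output.Bytes(),
			})
		}()
		// Respond right away with the bubble that connects to the new stream.
		component.StreamingChatBubble(fmt.Sprintf("/events?stream=%s", streamID), prompt).Render(r.Context(), w)
	})
	http.ListenAndServe(":8080", nil)
}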
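The publish loop above concatenates each token straight into a <span>. If the model emits characters such as < or &, the browser would treat them as markup. A minimal sketch of one way to guard against that, assuming the tokens are meant to be shown as plain text; tokenSpan is a hypothetical helper and not part of the original code:

```go
package main

import "html"

// tokenSpan is a hypothetical helper: it wraps one LLM token in a <span>
// and escapes HTML special characters so the token is displayed as text.
// Leading spaces are kept as-is, which is the reason for the <span> wrapper.
func tokenSpan(token string) string {
	return "<span>" + html.EscapeString(token) + "</span>"
}
```

In the loop, the Data field could then be built with []byte(tokenSpan(token)). The final bubble does not need this step, since Templ escapes string expressions such as { message } on its own.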