Introduction
The Raku package “WWW::OpenAI” provides access to the machine learning service OpenAI, [OAI1]. For more details on the usage of OpenAI’s API see the documentation, [OAI2].
The package “WWW::OpenAI” was announced approximately two months ago; see [AA1]. This blog post shows all the improvements and additions made since then, together with the “original” features.
Remark: The Raku package “WWW::OpenAI” is much “less ambitious” than the official Python package, [OAIp1], developed by OpenAI’s team.
Demo videos
- The video “Racoons playing with pearls and onions”, [AAv1], was made using “WWW::OpenAI”.
- The video “Streamlining ChatGPT code generation and narration workflows (Raku)”, [AAv4], demonstrates Raku code generation and execution using Wolfram Language (WL) packages that access OpenAI’s functionalities. (In a WL notebook.)
Functionalities
- Universal “front-end” – to access OpenAI’s services
- Models — list OpenAI models
- Code generation — code generation using chat- and text-completion
- Image generation — using DALL-E
- Moderation — probabilities of different moderation labels
- Audio transcription and translation — voice to text
- Embeddings — text into numerical vector
- Finding textual answers — text inquiries
- CLI — for the “best” UI
Before running the usage examples…
Remark: To use the OpenAI API one has to register and obtain an authorization key.
Remark: When the authorization key, auth-key, is specified to be Whatever then the functions openai-* attempt to use the environment variable OPENAI_API_KEY.
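For example, here is a minimal sketch of passing the key explicitly via the auth-key named argument (the key value below is a hypothetical placeholder):
use WWW::OpenAI;

# A hypothetical placeholder; use your own OpenAI authorization key:
my $auth-key = 'sk-...';

openai-playground('Say hello!', :$auth-key, max-tokens => 16);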
The following flowchart corresponds to the steps in the package function openai-playground:

Universal “front-end”
The package has a universal “front-end” function openai-playground for the different functionalities provided by OpenAI.
Here is a simple call for a “chat completion”:
use WWW::OpenAI;

openai-playground('Where is Roger Rabbit?', max-tokens => 64);
# [{finish_reason => stop, index => 0, logprobs => (Any), text =>
#
# Roger Rabbit is a fictional character created by Disney in 1988. He has appeared in several movies and television shows, but is not an actual person.}]
Here is another one, in Bulgarian, asking how many clusters can be found in a point cloud (the truncated answer below begins with “Depending on the size of the point cloud, there can be”):
openai-playground('Колко групи могат да се намерят в този облак от точки.', max-tokens => 64);
# [{finish_reason => length, index => 0, logprobs => (Any), text =>
#
# В зависимост от размера на облака от точки, може да бъдат}]
Remark: The function openai-completion can be used instead in the examples above. See the section “Create chat completion” of [OAI2] for more details.
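For instance, the first example above can be rewritten as:
openai-completion('Where is Roger Rabbit?', max-tokens => 64);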
Models
The current OpenAI models can be found with the function openai-models:
openai-models
# (ada ada-code-search-code ada-code-search-text ada-search-document ada-search-query ada-similarity ada:2020-05-03 babbage babbage-code-search-code babbage-code-search-text babbage-search-document babbage-search-query babbage-similarity babbage:2020-05-03 code-davinci-edit-001 code-search-ada-code-001 code-search-ada-text-001 code-search-babbage-code-001 code-search-babbage-text-001 curie curie-instruct-beta curie-search-document curie-search-query curie-similarity curie:2020-05-03 cushman:2020-05-03 davinci davinci-if:3.0.0 davinci-instruct-beta davinci-instruct-beta:2.0.0 davinci-search-document davinci-search-query davinci-similarity davinci:2020-05-03 gpt-3.5-turbo gpt-3.5-turbo-0301 if-curie-v2 if-davinci-v2 if-davinci:3.0.0 text-ada-001 text-ada:001 text-babbage-001 text-babbage:001 text-curie-001 text-curie:001 text-davinci-001 text-davinci-002 text-davinci-003 text-davinci-edit-001 text-davinci:001 text-embedding-ada-002 text-search-ada-doc-001 text-search-ada-query-001 text-search-babbage-doc-001 text-search-babbage-query-001 text-search-curie-doc-001 text-search-curie-query-001 text-search-davinci-doc-001 text-search-davinci-query-001 text-similarity-ada-001 text-similarity-babbage-001 text-similarity-curie-001 text-similarity-davinci-001 whisper-1)
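Since openai-models returns a plain list of model names, standard Raku list operations apply to it. Here is a small filtering sketch:
# Keep only the model names that contain "gpt":
.say for openai-models.grep(/ gpt /);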
Code generation
There are two types of completions: text and chat. Let us illustrate the differences in their usage with Raku code generation. Here is a text completion:
openai-completion(
    'generate Raku code for making a loop over a list',
    type => 'text',
    max-tokens => 120,
    format => 'values');
# my @list = <a b c d e f g h i j>;
# for @list -> $item {
#     say $item;
# }
Here is a chat completion:
openai-completion(
    'generate Raku code for making a loop over a list',
    type => 'chat',
    max-tokens => 120,
    format => 'values');
# Here's an example of how to make a loop over a list in Raku:
#
# ```
# my @list = (1, 2, 3, 4, 5);
#
# for @list -> $item {
#     say $item;
# }
# ```
#
# In this code, we define a list `@list` with some values. Then, we use a `for` loop to iterate over each item in the list. The `-> $item` syntax specifies that we want to assign each item to the variable `$item` as we loop through the list. Finally, we use the
Remark: The argument “type” and the argument “model” have to “agree.” (I.e. be found agreeable by OpenAI.) For example:
- model => 'text-davinci-003' implies type => 'text'
- model => 'gpt-3.5-turbo' implies type => 'chat'
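Here is a sketch that makes that agreement explicit by passing both named arguments:
openai-completion(
    'generate Raku code for making a loop over a list',
    model => 'gpt-3.5-turbo',
    type => 'chat',
    max-tokens => 120,
    format => 'values');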
Remark: The video “Streamlining ChatGPT code generation and narration workflows (Raku)”, [AAv4], uses Raku- and OpenAI-access packages made with Raku and Wolfram Language (WL). Nevertheless, the code generation it demonstrates can be replicated with “WWW::OpenAI”.
Image generation
Remark: See the files “Image-generation*” for more details.
Images can be generated with the function openai-create-image — see the section “Images” of [OAI2].
Here is an example:
my $imgB64 = openai-create-image(
    "racoon with a sliced onion in the style of Raphael",
    response-format => 'b64_json',
    n => 1,
    size => 'small',
    format => 'values',
    method => 'tiny');
Here are the option descriptions:
- response-format takes the values “url” and “b64_json”
- n takes a positive integer, for the number of images to be generated
- size takes the values '1024x1024', '512x512', '256x256', 'large', 'medium', 'small'
Here we generate an image, get its URL, and place (embed) a link to it via the output of the code cell:
my @imgRes = |openai-create-image(
    "racoon and onion in the style of Roy Lichtenstein",
    response-format => 'url',
    n => 1,
    size => 'small',
    method => 'tiny');

'';
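The URL itself can be pulled out of the result; here is a minimal sketch, assuming each element of the result is a hash with a url key:
# Assumption: the generation results are hashes with a 'url' key.
say @imgRes.head<url>;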
Moderation
Here is an example of using OpenAI’s moderation:
my @modRes = |openai-moderation(
    "I want to kill them!",
    format => "values",
    method => 'tiny');

for @modRes -> $m {
    .say for $m.pairs.sort(*.value).reverse;
}
# violence => 0.9635829329490662
# hate => 0.2717878818511963
# hate/threatening => 0.006235524546355009
# sexual => 8.503619142175012e-07
# violence/graphic => 2.7227645915672838e-08
# self-harm => 1.6152158499593838e-09
# sexual/minors => 1.3727728953583096e-09
Audio transcription and translation
Here is an example of using OpenAI’s audio transcription:
my $fileName = $*CWD ~ '/resources/HelloRaccoonsEN.mp3';
say openai-audio($fileName, format => 'json', method => 'tiny');
# { # "text": "Raku practitioners around the world, eat more onions!" # }
To do translations use the named argument type:
my $fileName = $*CWD ~ '/resources/HowAreYouRU.mp3';
say openai-audio($fileName, type => 'translations', format => 'json', method => 'tiny');
# { # "text": "How are you, bandits, hooligans? I've lost my mind because of you. I've been working as a guard for my whole life." # }
Embeddings
Embeddings can be obtained with the function openai-embeddings. Here is an example of finding the embedding vectors for each of the elements of an array of strings:
my @queries = [
    'make a classifier with the method RandomForeset over the data dfTitanic',
    'show precision and accuracy',
    'plot True Positive Rate vs Positive Predictive Value',
    'what is a good meat and potatoes recipe'
];

my $embs = openai-embeddings(@queries, format => 'values', method => 'tiny');

$embs.elems;
# 4
Here we show:
- That the result is an array of four vectors each with length 1536
- The distributions of the values of each vector
use Data::Reshapers;
use Data::Summarizers;

say "\$embs.elems : { $embs.elems }";
say "\$embs>>.elems : { $embs>>.elems }";

records-summary($embs.kv.Hash.&transpose);
# $embs.elems : 4
# $embs>>.elems : 1536 1536 1536 1536
# +--------------------------------+------------------------------+-------------------------------+-------------------------------+
# | 3 | 1 | 0 | 2 |
# +--------------------------------+------------------------------+-------------------------------+-------------------------------+
# | Min => -0.6049936 | Min => -0.6674932 | Min => -0.5897995 | Min => -0.6316293 |
# | 1st-Qu => -0.0128846505 | 1st-Qu => -0.012275769 | 1st-Qu => -0.013175397 | 1st-Qu => -0.0125476065 |
# | Mean => -0.00075456833016081 | Mean => -0.000762535416627 | Mean => -0.0007618981246602 | Mean => -0.0007296895499115 |
# | Median => -0.00069939 | Median => -0.0003188204 | Median => -0.00100605615 | Median => -0.00056341792 |
# | 3rd-Qu => 0.012142678 | 3rd-Qu => 0.011146013 | 3rd-Qu => 0.012387738 | 3rd-Qu => 0.011868718 |
# | Max => 0.22202122 | Max => 0.22815572 | Max => 0.21172291 | Max => 0.21270473 |
# +--------------------------------+------------------------------+-------------------------------+-------------------------------+
Here we find the corresponding dot products and (cross-)tabulate them:
use Data::Reshapers;
use Data::Summarizers;

my @ct = (^$embs.elems X ^$embs.elems).map({
    %( i => $_[0], j => $_[1], dot => sum($embs[$_[0]] >>*<< $embs[$_[1]]) )
}).Array;

say to-pretty-table(cross-tabulate(@ct, 'i', 'j', 'dot'), field-names => (^$embs.elems)>>.Str);
# +---+----------+----------+----------+----------+
# | | 0 | 1 | 2 | 3 |
# +---+----------+----------+----------+----------+
# | 0 | 1.000000 | 0.724412 | 0.756557 | 0.665149 |
# | 1 | 0.724412 | 1.000000 | 0.811169 | 0.715543 |
# | 2 | 0.756557 | 0.811169 | 1.000000 | 0.698977 |
# | 3 | 0.665149 | 0.715543 | 0.698977 | 1.000000 |
# +---+----------+----------+----------+----------+
Remark: Note that the fourth element (the cooking recipe request) is an outlier. (Judging by the table with dot products.)
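Remark also that the diagonal of the table consists of unities, i.e. the embedding vectors are unit length, so the dot products can be read as cosine similarities. Here is a sketch that verifies the norms directly:
# Euclidean norms of the embedding vectors (expected to be approximately 1):
say $embs.map({ sqrt(sum($_ >>*<< $_)) });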
Finding textual answers
Here is an example of finding textual answers:
my $text = "Lake Titicaca is a large, deep lake in the Andes on the border of Bolivia and Peru. By volume of water and by surface area, it is the largest lake in South America";

openai-find-textual-answer($text, "Where is Titicaca?")
# [Andes on the border of Bolivia and Peru .]
By default openai-find-textual-answer tries to give short answers. If the option “request” is Whatever then, depending on the number of questions, the request is one of these phrases:
- “give the shortest answer of the question:”
- “list the shortest answers of the questions:”
In the example above the full query given to OpenAI’s models is:
Given the text “Lake Titicaca is a large, deep lake in the Andes on the border of Bolivia and Peru. By volume of water and by surface area, it is the largest lake in South America” give the shortest answer of the question:
Where is Titicaca?
Here we get a longer answer by changing the value of “request”:
openai-find-textual-answer($text, "Where is Titicaca?", request => "answer the question:")
# [Titicaca is in the Andes on the border of Bolivia and Peru .]
Remark: The function openai-find-textual-answer is inspired by the Mathematica function FindTextualAnswer; see [JL1]. Unfortunately, at this time implementing the full signature of FindTextualAnswer with OpenAI’s API is not easy. (Or cheap to execute.)
Multiple questions
If several questions are given to the function openai-find-textual-answer then all questions are spliced with the given text into one query (that is sent to OpenAI).
For example, consider the following text and questions:
my $query = 'Make a classifier with the method RandomForest over the data dfTitanic; show precision and accuracy.';

my @questions = [
    'What is the dataset?',
    'What is the method?',
    'Which metrics to show?'
];
Then the query sent to OpenAI is:
Given the text: “Make a classifier with the method RandomForest over the data dfTitanic; show precision and accuracy.” list the shortest answers of the questions:
- What is the dataset?
- What is the method?
- Which metrics to show?
The answers are assumed to be given in the same order as the questions, each answer on a separate line. Hence, by splitting the OpenAI result into lines we get the answers corresponding to the questions.
If the questions lack question marks, the result is likely to have a completion as its first line, followed by the answers. In that situation the answers are not parsed and a warning message is given.
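Here is a sketch of the corresponding multi-question call; the pairs adverb is assumed to mirror the CLI option --pairs shown below in the CLI section, returning question-answer pairs instead of bare answers:
# Assumed adverb: pairs => True (mirrors the CLI flag --pairs).
.say for |openai-find-textual-answer($query, @questions, pairs => True);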
CLI
Playground access
The package provides a Command Line Interface (CLI) script:
openai-playground --help
# Usage:
# openai-playground <text> [--path=<Str>] [-n[=UInt]] [--max-tokens[=UInt]] [-m|--model=<Str>] [-r|--role=<Str>] [-t|--temperature[=Real]] [-l|--language=<Str>] [--response-format=<Str>] [-a|--auth-key=<Str>] [--timeout[=UInt]] [--format=<Str>] [--method=<Str>] -- Text processing using the OpenAI API.
# openai-playground [<words> ...] [-m|--model=<Str>] [--path=<Str>] [-n[=UInt]] [--max-tokens[=UInt]] [-r|--role=<Str>] [-t|--temperature[=Real]] [-l|--language=<Str>] [--response-format=<Str>] [-a|--auth-key=<Str>] [--timeout[=UInt]] [--format=<Str>] [--method=<Str>] -- Command given as a sequence of words.
#
# <text> Text to be processed or audio file name.
# --path=<Str> Path, one of 'chat/completions', 'images/generations', 'moderations', 'audio/transcriptions', 'audio/translations', 'embeddings', or 'models'. [default: 'chat/completions']
# -n[=UInt] Number of completions or generations. [default: 1]
# --max-tokens[=UInt] The maximum number of tokens to generate in the completion. [default: 100]
# -m|--model=<Str> Model. [default: 'Whatever']
# -r|--role=<Str> Role. [default: 'user']
# -t|--temperature[=Real] Temperature. [default: 0.7]
# -l|--language=<Str> Language. [default: '']
# --response-format=<Str> The format in which the generated images are returned; one of 'url' or 'b64_json'. [default: 'url']
# -a|--auth-key=<Str> Authorization key (to use OpenAI API.) [default: 'Whatever']
# --timeout[=UInt] Timeout. [default: 10]
# --format=<Str> Format of the result; one of "json" or "hash". [default: 'json']
# --method=<Str> Method for the HTTP POST query; one of "tiny" or "curl". [default: 'tiny']
Remark: When the authorization key argument “auth-key” is set to “Whatever” then openai-playground attempts to use the environment variable OPENAI_API_KEY.
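For example, here is a sketch invocation that uses only options documented above:
openai-playground 'Where is Roger Rabbit?' --max-tokens=64 --format=json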
Finding textual answers
The package provides a CLI script for finding textual answers:
openai-find-textual-answer --help
# Usage:
# openai-find-textual-answer <text> -q=<Str> [--max-tokens[=UInt]] [-m|--model=<Str>] [-t|--temperature[=Real]] [-r|--request=<Str>] [-p|--pairs] [-a|--auth-key=<Str>] [--timeout[=UInt]] [--format=<Str>] [--method=<Str>] -- Text processing using the OpenAI API.
# openai-find-textual-answer [<words> ...] -q=<Str> [--max-tokens[=UInt]] [-m|--model=<Str>] [-t|--temperature[=Real]] [-r|--request=<Str>] [-p|--pairs] [-a|--auth-key=<Str>] [--timeout[=UInt]] [--format=<Str>] [--method=<Str>] -- Command given as a sequence of words.
#
# <text> Text to be processed or audio file name.
# -q=<Str> Questions separated with '?' or ';'.
# --max-tokens[=UInt] The maximum number of tokens to generate in the completion. [default: 300]
# -m|--model=<Str> Model. [default: 'Whatever']
# -t|--temperature[=Real] Temperature. [default: 0.7]
# -r|--request=<Str> Request. [default: 'Whatever']
# -p|--pairs Should question-answer pairs be returned or not? [default: False]
# -a|--auth-key=<Str> Authorization key (to use OpenAI API.) [default: 'Whatever']
# --timeout[=UInt] Timeout. [default: 10]
# --format=<Str> Format of the result; one of "json" or "hash". [default: 'json']
# --method=<Str> Method for the HTTP POST query; one of "tiny" or "curl". [default: 'tiny']
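For example, here is a sketch invocation over the Lake Titicaca text from above:
openai-find-textual-answer 'Lake Titicaca is a large, deep lake in the Andes on the border of Bolivia and Peru.' -q='Where is Titicaca?'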
Refactoring
Separate files for each OpenAI functionality
The original implementation of “WWW::OpenAI” had a design and implementation very similar to those of “Lingua::Translation::DeepL”, [AAp1].
Major refactoring of the original code was done — now each OpenAI functionality targeted by “WWW::OpenAI” has its code placed in a separate file.
In order to do the refactoring, of course, a comprehensive enough suite of unit tests had to be put in place. Since running the tests costs money, the tests are placed in the “./xt” directory.
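Here is a minimal sketch of running that suite locally (assuming the prove6 test runner is installed and the environment variable OPENAI_API_KEY is set, since the tests make paid API calls):
prove6 -I. xt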
De-Cro-ing the requesting code
The first implementation of “WWW::OpenAI” used “Cro::HTTP::Client” to access OpenAI’s services. Often when I use “Cro::HTTP::Client” on macOS I get errors like:
Cannot locate symbol 'SSL_get1_peer_certificate' in native library
(See longer discussions about this problem here and here.)
Given the problems with “Cro::HTTP::Client” and the already existing implementations based on curl and “HTTP::Tiny”, I decided it is better to make “WWW::OpenAI” more lightweight by removing the code related to “Cro::HTTP::Client”.
References
Articles
[AA1] Anton Antonov, “WWW::OpenAI (for ChatGPT and other statistical gimmicks)”, (2023), RakuForPrediction at WordPress.
[JL1] Jérôme Louradour, “New in the Wolfram Language: FindTextualAnswer”, (2018), blog.wolfram.com.
Packages, platforms
[AAp1] Anton Antonov, Lingua::Translation::DeepL Raku package, (2022), GitHub/antononcube.
[AAp2] Anton Antonov, Text::CodeProcessing, (2021), GitHub/antononcube.
[OAI1] OpenAI Platform, OpenAI platform.
[OAI2] OpenAI Platform, OpenAI documentation.
[OAIp1] OpenAI, OpenAI Python Library, (2020), GitHub/openai.
Videos
[AAv1] Anton Antonov, “Racoons playing with pearls and onions”, (2023), YouTube/@AAA4prediction.
[AAv2] Anton Antonov, “OpenAIMode demo (Mathematica)”, (2023), YouTube/@AAA4prediction.
[AAv3] Anton Antonov, “OpenAIMode code generation workflows demo (Mathematica et al.)”, (2023), YouTube/@AAA4prediction.
[AAv4] Anton Antonov, “Streamlining ChatGPT code generation and narration workflows (Raku)”, (2023), YouTube/@AAA4prediction.