WWW::PaLM (for Bard and other hallucinators)

This blog post proclaims the Raku package “WWW::PaLM” for connecting with PaLM (Pathways Language Model).

The design and implementation of the package closely follows that of “WWW::OpenAI”, [AAp1].

Remark: At this point the functionalities provided by the PaLM API are “only” about text generation, message generation, and text embedding. Hence, a comprehensive comparison of PaLM and OpenAI with, say, “WWW::PaLM” and “WWW::OpenAI” will use only those three (most important) functionalities.

Remark: Text generation with the PaLM API includes moderation, hence comparing the moderation facilities of PaLM and OpenAI is possible (and relatively easy.)

Installation

From Zef ecosystem:

zef install WWW::PaLM

From GitHub:

zef install https://github.com/antononcube/Raku-WWW-PaLM

Usage examples

Show models:

use WWW::PaLM;

palm-models()
# (models/chat-bison-001 models/text-bison-001 models/embedding-gecko-001)

Show text generation:

.say for palm-generate-text('what is the population in Brazil?', format => 'values', n => 3);
# The population in Brazil is 213,317,650.
# The population of Brazil is 212,609,039.
# The population of Brazil is 211,018,526.

Show message generation:

.say for palm-generate-message('Who wrote the book "Dune"?');
# {candidates => [{author => 1, content => Frank Herbert wrote the book "Dune". It was first published in 1965 and is considered a classic of science fiction literature. The book tells the story of Paul Atreides, a young man who is thrust into a war for control of the desert planet Arrakis. Dune has been adapted into several films and television series, and its popularity has only grown in recent years.}], messages => [{author => 0, content => Who wrote the book "Dune"?}]}

Show text embeddings:

my @vecs = palm-embed-text(["say something nice!",
                            "shout something bad!",
                            "wher is the best coffee made?"],
        format => 'values');

.say for @vecs;
# [0.0011238843 -0.040586308 -0.013174802 0.015497498 0.04383781 0.012527679 0.017876161 0.031339817 -0.0042566974 -0.024129443 -0.023050068 -0.015625203 0.03501345 -0.006033779 -0.011984176 -0.033368077 -0.040653296 0.022117265 -0.02034076 -0.040752005 -0.12748374 0.029760985 0.00084632 -0.017502416 -0.03893842 -0.07151896 0.0609997 -0.0046266303 -0.044301335 -0.022592714 0.023920823 0.0020489343 -0.0048049283 -0.038431767 0.007275116 0.018562535 0.017131427 -0.00043720857 0.02810143 0.053296003 0.031831037 -0.067091785 0.015640317 -0.0036152988 -0.04691379 -0.044070054 -0.022364588 -0.0083763655 -0.0490714 -0.007302964 0.006516001 -0.004685413 0.03979989 -0.014196505 0.01065721 -0.0073698894 -0.036348466 -0.008763353 -0.01892632 -0.054501593 0.032021806 -0.007242739 0.0220439 -0.07687204 -0.0740849 0.01748987 -0.027381063 -0.015533608 -0.013165218 -0.04867313 0.041797243 0.017989235 -0.00982055 0.03691631 0.010164966 -0.03985747 0.024958102 0.015761184 0.02152081 -0.06986678 -0.012039569 0.00056548475 -0.030969337 -0.07435745 -0.028182292 0.012739926 0.042157806 0.023305222 -0.03230193 -0.0033747193 -0.061240233 0.021881713 0.009969781 -0.010255637 -0.0049942187 -0.034989387 0.020215971 -0.020086009 0.0010875552 0.017283859 ...]
# [0.008541122 -0.024862545 5.7058678e-05 0.0038867046 0.067829944 0.021747978 -0.029083902 -0.004656434 -0.012782786 -0.018360043 -0.019952606 0.007747536 -0.0056520714 -0.038842484 0.0144888265 -0.028959233 -0.034373183 0.0086536445 0.0030540912 0.005582053 -0.123638585 0.04185863 0.009929637 0.017287828 -0.018644271 -0.08450658 -0.014902289 -0.05526195 -0.020689102 -0.0024103096 0.019860586 0.024158986 -0.05455204 -0.01493725 0.0006892038 0.00323405 -0.0006728824 -0.010576684 -0.014293131 0.038979508 0.01347389 -0.050580844 0.004814549 -0.031785876 -0.017129553 -0.013408159 -0.019374503 0.008474182 0.0013470884 -0.01079351 -0.011820318 0.002939847 0.019571697 0.030853825 0.013096097 -0.015389974 -0.017276747 -0.0005738943 -0.029613884 -0.035326067 0.002852012 0.016189976 0.025315559 -0.07327017 -0.039907347 0.03896169 -0.050800353 -0.039951827 0.0016103057 -0.06276334 0.03115886 0.011537878 -0.034277618 0.015627442 0.020582594 -0.040219635 0.011975396 0.037462063 0.06526648 -0.068473086 -0.014561573 0.005528434 -0.0057561444 -0.0419415 -0.007748433 0.012975432 0.01270717 0.046910107 -0.038431607 -0.0022757803 -0.08964604 0.020557715 0.017819826 0.0076504718 -0.053408835 0.012460911 -0.021491397 -0.02011912 -0.011796679 0.007281153 ...]
# [-0.04053886 -0.028188298 0.0223612 0.04598037 0.018368412 -0.04011965 0.042926013 0.021050882 -0.0094422195 -0.01832212 0.019964134 -0.0013347232 0.04966835 0.048276108 -0.005766044 -0.02433299 -0.040598087 -0.065635845 0.02209697 0.06312037 -0.085269 -0.012106354 0.0056219148 0.0030168872 -0.07658486 -0.048572473 -0.04269039 0.026315054 -0.006883465 -0.010089017 0.009539455 -0.030142343 -0.058066234 -0.03262922 -0.0018920723 0.062120162 -0.008042096 0.04948106 -0.00028861413 0.026026316 0.04703804 -0.012234463 0.058435097 -0.05434934 -0.049611546 -0.020385403 -0.05354296 0.07173936 0.014864944 -0.016838027 -0.024500856 0.008486 -0.0027952811 0.016175343 0.019633992 -0.0010128785 -0.03879283 -0.0091371685 0.041809056 -0.024545915 -0.009020674 0.018044515 0.025408195 -0.07722929 -0.04280286 -0.0077904514 -0.0014599559 -0.045557976 -0.009131333 -0.038542382 0.0069833044 0.043738525 -0.011051313 -0.010426432 -0.05466805 0.013140538 0.0069446876 -0.035760716 0.04011649 -0.066007614 -0.035028048 -0.03821088 -0.031489357 -0.06864613 0.0075672227 -0.0032299925 0.007995906 0.0020973664 -0.02571021 0.070727535 -0.09410694 -0.011029921 -0.010794598 -0.022694781 -0.041388575 0.038010556 -0.006979773 -0.034256544 -0.015818557 -0.0064860974 ...]

Command Line Interface

Maker suite access

The package provides a Command Line Interface (CLI) script:

palm-prompt --help
# Usage:
#   palm-prompt <text> [--path=<Str>] [-n[=UInt]] [--max-tokens|--max-output-tokens[=UInt]] [-t|--temperature[=Real]] [-m|--model=<Str>] [-a|--auth-key=<Str>] [--timeout[=UInt]] [--format=<Str>] [--method=<Str>] -- Text processing using the PaLM API.
#   palm-prompt [<words> ...] [--path=<Str>] [-n[=UInt]] [--max-tokens|--max-output-tokens[=UInt]] [-m|--model=<Str>] [-t|--temperature[=Real]] [-a|--auth-key=<Str>] [--timeout[=UInt]] [--format=<Str>] [--method=<Str>] -- Command given as a sequence of words.
#   
#     <text>                                     Text to be processed or audio file name.
#     --path=<Str>                               Path, one of 'generateText', 'generateMessage', 'embedText', or 'models'. [default: 'generateText']
#     -n[=UInt]                                  Number of completions or generations. [default: 1]
#     --max-tokens|--max-output-tokens[=UInt]    The maximum number of tokens to generate in the completion. [default: 100]
#     -t|--temperature[=Real]                    Temperature. [default: 0.7]
#     -m|--model=<Str>                           Model. [default: 'Whatever']
#     -a|--auth-key=<Str>                        Authorization key (to use PaLM API.) [default: 'Whatever']
#     --timeout[=UInt]                           Timeout. [default: 10]
#     --format=<Str>                             Format of the result; one of "json" or "hash". [default: 'json']
#     --method=<Str>                             Method for the HTTP POST query; one of "tiny" or "curl". [default: 'tiny']

Remark: When the authorization key argument “auth-key” is set to “Whatever” then palm-prompt attempts to use the env variable PALM_API_KEY.
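Alternatively, the key can be given explicitly. Here is a minimal sketch, assuming the package functions take the named argument auth-key (as their “WWW::OpenAI” counterparts do):

use WWW::PaLM;

# Take the key from the environment (or assign an explicitly obtained key)
my $auth-key = %*ENV<PALM_API_KEY>;

palm-generate-text('what is the population in Brazil?', :$auth-key, format => 'values');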


Mermaid diagram

The following flowchart corresponds to the steps in the package function palm-prompt:


TODO

  • Implement moderations.
  • Comparison with “WWW::OpenAI”, [AAp1].
  • Hook-up finding textual answers implemented in “WWW::OpenAI”, [AAp1].

References

Packages, platforms

[AAp1] Anton Antonov, WWW::OpenAI Raku package, (2023), GitHub/antononcube.

[AAp2] Anton Antonov, Lingua::Translation::DeepL Raku package, (2022), GitHub/antononcube.

[OAI1] OpenAI Platform, OpenAI platform.

[ZG1] Zoubin Ghahramani, “Introducing PaLM 2”, (2023), Google Official Blog on AI.

Further work on WWW::OpenAI

Introduction

The Raku package “WWW::OpenAI” provides access to the machine learning service OpenAI, [OAI1]. For more details of OpenAI’s API usage see the documentation, [OAI2].

The package “WWW::OpenAI” was proclaimed approximately two months ago — see [AA1]. This blog post shows all the improvements and additions since then together with the “original” features.

Remark: The Raku package “WWW::OpenAI” is much “less ambitious” than the official Python package, [OAIp1], developed by OpenAI’s team.

Demo videos

Functionalities

  • Universal “front-end” — to access OpenAI’s services
  • Models — list OpenAI models
  • Code generation — code generation using chat- and text-completion
  • Image generation — using DALL-E
  • Moderation — probabilities of different moderation labels
  • Audio transcription and translation — voice to text
  • Embeddings — text into numerical vector
  • Finding textual answers — text inquiries
  • CLI — for the “best” UI

Before running the usage examples…

Remark: To use the OpenAI API one has to register and obtain an authorization key.

Remark: When the authorization key, auth-key, is specified to be Whatever then the functions openai-* attempt to use the env variable OPENAI_API_KEY.

The following flowchart corresponds to the steps in the package function openai-playground:


Universal “front-end”

The package has a universal “front-end” function openai-playground for the different functionalities provided by OpenAI.

Here is a simple call for a “chat completion”:

use WWW::OpenAI;
openai-playground('Where is Roger Rabbit?', max-tokens => 64);
# [{finish_reason => stop, index => 0, logprobs => (Any), text => 
# 
# Roger Rabbit is a fictional character created by Disney in 1988. He has appeared in several movies and television shows, but is not an actual person.}]

Another one using Bulgarian:

openai-playground('Колко групи могат да се намерят в този облак от точки.', max-tokens => 64);
# [{finish_reason => length, index => 0, logprobs => (Any), text => 
# 
# В зависимост от размера на облака от точки, може да бъдат}]

Remark: The function openai-completion can be used instead in the examples above. See the section “Create chat completion” of [OAI2] for more details.


Models

The current OpenAI models can be found with the function openai-models:

openai-models
# (ada ada-code-search-code ada-code-search-text ada-search-document ada-search-query ada-similarity ada:2020-05-03 babbage babbage-code-search-code babbage-code-search-text babbage-search-document babbage-search-query babbage-similarity babbage:2020-05-03 code-davinci-edit-001 code-search-ada-code-001 code-search-ada-text-001 code-search-babbage-code-001 code-search-babbage-text-001 curie curie-instruct-beta curie-search-document curie-search-query curie-similarity curie:2020-05-03 cushman:2020-05-03 davinci davinci-if:3.0.0 davinci-instruct-beta davinci-instruct-beta:2.0.0 davinci-search-document davinci-search-query davinci-similarity davinci:2020-05-03 gpt-3.5-turbo gpt-3.5-turbo-0301 if-curie-v2 if-davinci-v2 if-davinci:3.0.0 text-ada-001 text-ada:001 text-babbage-001 text-babbage:001 text-curie-001 text-curie:001 text-davinci-001 text-davinci-002 text-davinci-003 text-davinci-edit-001 text-davinci:001 text-embedding-ada-002 text-search-ada-doc-001 text-search-ada-query-001 text-search-babbage-doc-001 text-search-babbage-query-001 text-search-curie-doc-001 text-search-curie-query-001 text-search-davinci-doc-001 text-search-davinci-query-001 text-similarity-ada-001 text-similarity-babbage-001 text-similarity-curie-001 text-similarity-davinci-001 whisper-1)

Code generation

There are two types of completions: text and chat. Let us illustrate the differences in their usage with Raku code generation. Here is a text completion:

openai-completion(
        'generate Raku code for making a loop over a list',
        type => 'text',
        max-tokens => 120,
        format => 'values');
# my @list = <a b c d e f g h i j>;
# for @list -> $item {
#     say $item;
# }

Here is a chat completion:

openai-completion(
        'generate Raku code for making a loop over a list',
        type => 'chat',
        max-tokens => 120,
        format => 'values');
# Here's an example of how to make a loop over a list in Raku:
# 
# ```
# my @list = (1, 2, 3, 4, 5);
# 
# for @list -> $item {
#     say $item;
# }
# ```
# 
# In this code, we define a list `@list` with some values. Then, we use a `for` loop to iterate over each item in the list. The `-> $item` syntax specifies that we want to assign each item to the variable `$item` as we loop through the list. Finally, we use the

Remark: The argument “type” and the argument “model” have to “agree.” (I.e. be found agreeable by OpenAI.) For example:

  • model => 'text-davinci-003' implies type => 'text'
  • model => 'gpt-3.5-turbo' implies type => 'chat'
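Here is a hedged illustration of making that agreement explicit (same prompt and options as in the chat completion example above):

openai-completion(
        'generate Raku code for making a loop over a list',
        model => 'gpt-3.5-turbo',
        type => 'chat',
        max-tokens => 120,
        format => 'values');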

Remark: The video “Streamlining ChatGPT code generation and narration workflows (Raku)”, [AAv4], uses Raku- and OpenAI-access packages made with Raku and Wolfram Language (WL). Nevertheless, the code generation demonstrated can be replicated with “WWW::OpenAI”.


Image generation

Remark: See the files “Image-generation*” for more details.

Images can be generated with the function openai-create-image — see the section “Images” of [OAI2].

Here is an example:

my $imgB64 = openai-create-image(
        "racoon with a sliced onion in the style of Raphael",
        response-format => 'b64_json',
        n => 1,
        size => 'small',
        format => 'values',
        method => 'tiny');

Here are the options descriptions:

  • response-format takes the values “url” and “b64_json”
  • n takes a positive integer, for the number of images to be generated
  • size takes the values '1024x1024', '512x512', '256x256', 'large', 'medium', 'small'.

Here we generate an image, get its URL, and place (embed) a link to it via the output of the code cell:

my @imgRes = |openai-create-image(
        "racoon and onion in the style of Roy Lichtenstein",
        response-format => 'url',
        n => 1,
        size => 'small',
        method => 'tiny');

'![](' ~ @imgRes.head<url> ~ ')';

Moderation

Here is an example of using OpenAI’s moderation:

my @modRes = |openai-moderation(
"I want to kill them!",
format => "values",
method => 'tiny');

for @modRes -> $m { .say for $m.pairs.sort(*.value).reverse; }
# violence => 0.9635829329490662
# hate => 0.2717878818511963
# hate/threatening => 0.006235524546355009
# sexual => 8.503619142175012e-07
# violence/graphic => 2.7227645915672838e-08
# self-harm => 1.6152158499593838e-09
# sexual/minors => 1.3727728953583096e-09

Audio transcription and translation

Here is an example of using OpenAI’s audio transcription:

my $fileName = $*CWD ~ '/resources/HelloRaccoonsEN.mp3';
say openai-audio(
        $fileName,
        format => 'json',
        method => 'tiny');
# {
#   "text": "Raku practitioners around the world, eat more onions!"
# }

To do translations use the named argument type:

my $fileName = $*CWD ~ '/resources/HowAreYouRU.mp3';
say openai-audio(
        $fileName,
        type => 'translations',
        format => 'json',
        method => 'tiny');
# {
#   "text": "How are you, bandits, hooligans? I've lost my mind because of you. I've been working as a guard for my whole life."
# }

Embeddings

Embeddings can be obtained with the function openai-embeddings. Here is an example of finding the embedding vectors for each of the elements of an array of strings:

my @queries = [
    'make a classifier with the method RandomForest over the data dfTitanic',
    'show precision and accuracy',
    'plot True Positive Rate vs Positive Predictive Value',
    'what is a good meat and potatoes recipe'
];

my $embs = openai-embeddings(@queries, format => 'values', method => 'tiny');
$embs.elems;
# 4

Here we show:

  • That the result is an array of four vectors each with length 1536
  • The distributions of the values of each vector

use Data::Reshapers;
use Data::Summarizers;

say "\$embs.elems : { $embs.elems }";
say "\$embs>>.elems : { $embs>>.elems }";
records-summary($embs.kv.Hash.&transpose);
# $embs.elems : 4
# $embs>>.elems : 1536 1536 1536 1536
# +--------------------------------+------------------------------+-------------------------------+-------------------------------+
# | 3                              | 1                            | 0                             | 2                             |
# +--------------------------------+------------------------------+-------------------------------+-------------------------------+
# | Min    => -0.6049936           | Min    => -0.6674932         | Min    => -0.5897995          | Min    => -0.6316293          |
# | 1st-Qu => -0.0128846505        | 1st-Qu => -0.012275769       | 1st-Qu => -0.013175397        | 1st-Qu => -0.0125476065       |
# | Mean   => -0.00075456833016081 | Mean   => -0.000762535416627 | Mean   => -0.0007618981246602 | Mean   => -0.0007296895499115 |
# | Median => -0.00069939          | Median => -0.0003188204      | Median => -0.00100605615      | Median => -0.00056341792      |
# | 3rd-Qu => 0.012142678          | 3rd-Qu => 0.011146013        | 3rd-Qu => 0.012387738         | 3rd-Qu => 0.011868718         |
# | Max    => 0.22202122           | Max    => 0.22815572         | Max    => 0.21172291          | Max    => 0.21270473          |
# +--------------------------------+------------------------------+-------------------------------+-------------------------------+

Here we find the corresponding dot products and (cross-)tabulate them:

use Data::Reshapers;
use Data::Summarizers;
my @ct = (^$embs.elems X ^$embs.elems).map({ %( i => $_[0], j => $_[1], dot => sum($embs[$_[0]] >>*<< $embs[$_[1]])) }).Array;

say to-pretty-table(cross-tabulate(@ct, 'i', 'j', 'dot'), field-names => (^$embs.elems)>>.Str);
# +---+----------+----------+----------+----------+
# |   |    0     |    1     |    2     |    3     |
# +---+----------+----------+----------+----------+
# | 0 | 1.000000 | 0.724412 | 0.756557 | 0.665149 |
# | 1 | 0.724412 | 1.000000 | 0.811169 | 0.715543 |
# | 2 | 0.756557 | 0.811169 | 1.000000 | 0.698977 |
# | 3 | 0.665149 | 0.715543 | 0.698977 | 1.000000 |
# +---+----------+----------+----------+----------+

Remark: Note that the fourth element (the cooking recipe request) is an outlier. (Judging by the table with dot products.)


Finding textual answers

Here is an example of finding textual answers:

my $text = "Lake Titicaca is a large, deep lake in the Andes 
on the border of Bolivia and Peru. By volume of water and by surface 
area, it is the largest lake in South America";

openai-find-textual-answer($text, "Where is Titicaca?")
# [Andes on the border of Bolivia and Peru .]

By default openai-find-textual-answer tries to give short answers. If the option “request” is Whatever then, depending on the number of questions, the request is one of these phrases:

  • “give the shortest answer of the question:”
  • “list the shortest answers of the questions:”

In the example above the full query given to OpenAI’s models is

Given the text “Lake Titicaca is a large, deep lake in the Andes on the border of Bolivia and Peru. By volume of water and by surface area, it is the largest lake in South America” give the shortest answer of the question:
Where is Titicaca?

Here we get a longer answer by changing the value of “request”:

openai-find-textual-answer($text, "Where is Titicaca?", request => "answer the question:")
# [Titicaca is in the Andes on the border of Bolivia and Peru .]

Remark: The function openai-find-textual-answer is inspired by the Mathematica function FindTextualAnswer; see [JL1]. Unfortunately, at this time implementing the full signature of FindTextualAnswer with OpenAI’s API is not easy. (Or cheap to execute.)

Multiple questions

If several questions are given to the function openai-find-textual-answer then all questions are spliced with the given text into one query (that is sent to OpenAI.)

For example, consider the following text and questions:

my $query = 'Make a classifier with the method RandomForest over the data dfTitanic; show precision and accuracy.';

my @questions =
        ['What is the dataset?',
         'What is the method?',
         'Which metrics to show?'
        ];
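With those definitions the corresponding call is as follows (a sketch; the actual output depends on the model):

openai-find-textual-answer($query, @questions);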

Then the query sent to OpenAI is:

Given the text: “Make a classifier with the method RandomForest over the data dfTitanic; show precision and accuracy.” list the shortest answers of the questions:

  1. What is the dataset?
  2. What is the method?
  3. Which metrics to show?

The answers are assumed to be given in the same order as the questions, each answer on a separate line. Hence, by splitting the OpenAI result into lines we get the answers corresponding to the questions.

If the questions are missing question marks, the result may have a completion as a first line followed by the answers. In that situation the answers are not parsed and a warning message is given.
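Here is a minimal sketch of that kind of line-splitting post-processing (hypothetical code for illustration, not the package internals):

# A made-up result in the expected format: one numbered answer per line
my $res = "1. dfTitanic\n2. RandomForest\n3. precision and accuracy";

# Split into lines and drop the leading enumeration
my @answers = $res.lines.map({ .subst(/^ \d+ '.' \s*/, '').trim });

.say for @answers;
# dfTitanic
# RandomForest
# precision and accuracy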


CLI

Playground access

The package provides a Command Line Interface (CLI) script:

openai-playground --help
# Usage:
#   openai-playground <text> [--path=<Str>] [-n[=UInt]] [--max-tokens[=UInt]] [-m|--model=<Str>] [-r|--role=<Str>] [-t|--temperature[=Real]] [-l|--language=<Str>] [--response-format=<Str>] [-a|--auth-key=<Str>] [--timeout[=UInt]] [--format=<Str>] [--method=<Str>] -- Text processing using the OpenAI API.
#   openai-playground [<words> ...] [-m|--model=<Str>] [--path=<Str>] [-n[=UInt]] [--max-tokens[=UInt]] [-r|--role=<Str>] [-t|--temperature[=Real]] [-l|--language=<Str>] [--response-format=<Str>] [-a|--auth-key=<Str>] [--timeout[=UInt]] [--format=<Str>] [--method=<Str>] -- Command given as a sequence of words.
#   
#     <text>                     Text to be processed or audio file name.
#     --path=<Str>               Path, one of 'chat/completions', 'images/generations', 'moderations', 'audio/transcriptions', 'audio/translations', 'embeddings', or 'models'. [default: 'chat/completions']
#     -n[=UInt]                  Number of completions or generations. [default: 1]
#     --max-tokens[=UInt]        The maximum number of tokens to generate in the completion. [default: 100]
#     -m|--model=<Str>           Model. [default: 'Whatever']
#     -r|--role=<Str>            Role. [default: 'user']
#     -t|--temperature[=Real]    Temperature. [default: 0.7]
#     -l|--language=<Str>        Language. [default: '']
#     --response-format=<Str>    The format in which the generated images are returned; one of 'url' or 'b64_json'. [default: 'url']
#     -a|--auth-key=<Str>        Authorization key (to use OpenAI API.) [default: 'Whatever']
#     --timeout[=UInt]           Timeout. [default: 10]
#     --format=<Str>             Format of the result; one of "json" or "hash". [default: 'json']
#     --method=<Str>             Method for the HTTP POST query; one of "tiny" or "curl". [default: 'tiny']

Remark: When the authorization key argument “auth-key” is set to “Whatever” then openai-playground attempts to use the env variable OPENAI_API_KEY.

Finding textual answers

The package provides a CLI script for finding textual answers:

openai-find-textual-answer --help
# Usage:
#   openai-find-textual-answer <text> -q=<Str> [--max-tokens[=UInt]] [-m|--model=<Str>] [-t|--temperature[=Real]] [-r|--request=<Str>] [-p|--pairs] [-a|--auth-key=<Str>] [--timeout[=UInt]] [--format=<Str>] [--method=<Str>] -- Text processing using the OpenAI API.
#   openai-find-textual-answer [<words> ...] -q=<Str> [--max-tokens[=UInt]] [-m|--model=<Str>] [-t|--temperature[=Real]] [-r|--request=<Str>] [-p|--pairs] [-a|--auth-key=<Str>] [--timeout[=UInt]] [--format=<Str>] [--method=<Str>] -- Command given as a sequence of words.
#   
#     <text>                     Text to be processed or audio file name.
#     -q=<Str>                   Questions separated with '?' or ';'.
#     --max-tokens[=UInt]        The maximum number of tokens to generate in the completion. [default: 300]
#     -m|--model=<Str>           Model. [default: 'Whatever']
#     -t|--temperature[=Real]    Temperature. [default: 0.7]
#     -r|--request=<Str>         Request. [default: 'Whatever']
#     -p|--pairs                 Should question-answer pairs be returned or not? [default: False]
#     -a|--auth-key=<Str>        Authorization key (to use OpenAI API.) [default: 'Whatever']
#     --timeout[=UInt]           Timeout. [default: 10]
#     --format=<Str>             Format of the result; one of "json" or "hash". [default: 'json']
#     --method=<Str>             Method for the HTTP POST query; one of "tiny" or "curl". [default: 'tiny']

Refactoring

Separate files for each OpenAI functionality

The original implementation of “WWW::OpenAI” had a design and implementation very similar to those of “Lingua::Translation::DeepL”, [AAp1].

Major refactoring of the original code was done — now each OpenAI functionality targeted by “WWW::OpenAI” has its code placed in a separate file.

In order to do the refactoring, of course, a comprehensive enough suite of unit tests had to be put in place. Since running the tests costs money, the tests are placed in the “./xt” directory.

De-Cro-ing the requesting code

The first implementation of “WWW::OpenAI” used “Cro::HTTP::Client” to access OpenAI’s services. Often when I use “Cro::HTTP::Client” on macOS I get the errors:

Cannot locate symbol 'SSL_get1_peer_certificate' in native library

(See longer discussions about this problem here and here.)

Given the problems of using “Cro::HTTP::Client”, and having working implementations based on curl and “HTTP::Tiny”, I decided it is better to make the implementation of “WWW::OpenAI” more lightweight by removing the code related to “Cro::HTTP::Client”.


References

Articles

[AA1] Anton Antonov, “WWW::OpenAI (for ChatGPT and other statistical gimmicks)”, (2023), RakuForPrediction at WordPress.

[JL1] Jérôme Louradour, “New in the Wolfram Language: FindTextualAnswer”, (2018), blog.wolfram.com.

Packages, platforms

[AAp1] Anton Antonov, Lingua::Translation::DeepL Raku package, (2022), GitHub/antononcube.

[AAp2] Anton Antonov, Text::CodeProcessing, (2021), GitHub/antononcube.

[OAI1] OpenAI Platform, OpenAI platform.

[OAI2] OpenAI Platform, OpenAI documentation.

[OAIp1] OpenAI, OpenAI Python Library, (2020), GitHub/openai.

Videos

[AAv1] Anton Antonov, “Racoons playing with pearls and onions”, (2023), YouTube/@AAA4prediction.

[AAv2] Anton Antonov, “OpenAIMode demo (Mathematica)”, (2023), YouTube/@AAA4prediction.

[AAv3] Anton Antonov, “OpenAIMode code generation workflows demo (Mathematica et al.)”, (2023), YouTube/@AAA4prediction.

[AAv4] Anton Antonov, “Streamlining ChatGPT code generation and narration workflows (Raku)”, (2023), YouTube/@AAA4prediction.

WWW::MermaidInk

The function mermaid-ink of the Raku package “WWW::MermaidInk” gets images corresponding to Mermaid-js specifications via the web Mermaid-ink interface of Mermaid-js.

For a “full set” of examples see the file MermaidInk_woven.html.


Usage

use WWW::MermaidInk
loads the package.

mermaid-ink($spec)
retrieves an image defined by the spec $spec from Mermaid’s Ink Web interface.

mermaid-ink($spec, format => 'md-image')
returns a string that is a Markdown image specification in Base64 format.

mermaid-ink($spec, file => $fileName)
exports the retrieved image into a specified PNG file.

mermaid-ink($spec, file => Whatever)
exports the retrieved image into the file $*CWD ~ '/out.png'.
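For example, here is a hedged one-liner that exports a (trivial) flowchart into a PNG file (the spec and file name are illustrative):

mermaid-ink('graph TD; A --> B', file => 'flow.png');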

Details & Options

  • Mermaid lets you create diagrams and visualizations using text and code.
  • Mermaid has different types of diagrams: Flowchart, Sequence Diagram, Class Diagram, State Diagram, Entity Relationship Diagram, User Journey, Gantt, Pie Chart, Requirement Diagram, and others. It is a JavaScript based diagramming and charting tool that renders Markdown-inspired text definitions to create and modify diagrams dynamically.
  • mermaid-ink uses Mermaid’s functionalities via the Web interface “https://mermaid.ink/img”.
  • The first argument can be a string (that is, a mermaid-js specification) or a list of pairs.
  • The option “directives” can be used to control the layout of Mermaid diagrams if the first argument is a list of pairs.
  • mermaid-ink produces images only.

Examples

Basic Examples

Generate a flowchart from a Mermaid specification:

use WWW::MermaidInk;

'graph TD 
   WL --> |ZMQ|Python --> |ZMQ|WL' 
 ==> mermaid-ink(format=>'md-image')

Create a Markdown image expression from a class diagram:

my $spec = q:to/END/;
classDiagram
    Animal <|-- Duck
    Animal <|-- Fish
    Animal <|-- Zebra
    Animal : +int age
    Animal : +String gender
    Animal: +isMammal()
    Animal: +mate()
    class Duck{
        +String beakColor
        +swim()
        +quack()
    }
    class Fish{
        -int sizeInFeet
        -canEat()
    }
    class Zebra{
        +bool is_wild
        +run()
    }
END

mermaid-ink($spec, format=>'md-image')    

Scope

The first argument can be a list of pairs — the corresponding Mermaid-js graph is produced. Here are the edges of a directed graph:

my @edges = ['1' => '3', '3' => '1', '1' => '4', '2' => '3', '2' => '4', '3' => '4'];

# [1 => 3 3 => 1 1 => 4 2 => 3 2 => 4 3 => 4]

Here is the corresponding mermaid-js image:

mermaid-ink(@edges, format=>'md-image')


Command Line Interface (CLI)

The package provides the CLI script mermaid-ink. Here is its help message:

mermaid-ink --help

# Usage:
#   mermaid-ink <spec> [-o|--file=<Str>] [--format=<Str>] -- Diagram image for Mermaid-JS spec (via mermaid.ink).
#   mermaid-ink [<words> ...] [-o|--file=<Str>] [--format=<Str>] -- Command given as a sequence of words.
#   
#     <spec>             Mermaid-JS spec.
#     -o|--file=<Str>    File to export the image to. [default: '']
#     --format=<Str>     Format of the result; one of "asis", "base64", "md-image", or "none". [default: 'md-image']


Flowchart

This flowchart summarizes the execution path of obtaining Mermaid images in a Markdown document:


References

Articles

[AA1] Anton Antonov, “Interactive Mermaid diagrams generation via Markdown evaluation”, (2022), RakuForPrediction at WordPress.

Functions and packages

[AAf1] Anton Antonov, MermaidInk Mathematica resource function, (2022-2023), Wolfram Function Repository.

Mermaid resources

DSL::FiniteStateMachines

Introduction

This blog post introduces and exemplifies the Raku package “DSL::FiniteStateMachines” that has class definitions and functions for creation of Finite State Machines (FSMs) and their execution.

This video excerpt, [AAv1], demonstrates the usage workflow of a particular FSM made with that package:


Usage example (Address book)

Here we load the definition of the class AddressBookCaller (provided by this package) and the related entities package, “DSL::Entity::AddressBook”:

use DSL::FiniteStateMachines::AddressBookCaller;

use DSL::Entity::AddressBook;
use DSL::Entity::AddressBook::ResourceAccess;

# (Any)

Here we obtain a resource object to access a (particular) address book:

my $resourceObj = DSL::Entity::AddressBook::resource-access-object();

# DSL::Entity::AddressBook::ResourceAccess.new

Here we create the FSM and show its states:

my DSL::FiniteStateMachines::AddressBookCaller $abcFSM .= new;

$abcFSM.make-machine(($resourceObj,));

.say for $abcFSM.states;

# WaitForCallCommand => State object < id => WaitForCallCommand, action => -> $obj { #`(Block|4570571503456) ... } >
# ListOfItems => State object < id => ListOfItems, action => -> $obj { #`(Block|4570571503528) ... } >
# Help => State object < id => Help, action => -> $obj { #`(Block|4570571503600) ... } >
# Exit => State object < id => Exit, action => -> $obj { #`(Block|4570571503672) ... } >
# PrioritizedList => State object < id => PrioritizedList, action => -> $obj { #`(Block|4570571503744) ... } >
# AcquireItem => State object < id => AcquireItem, action => -> $obj { #`(Block|4570571503816) ... } >
# ActOnItem => State object < id => ActOnItem, action => -> $obj { #`(Block|4570571503888) ... } >
# WaitForRequest => State object < id => WaitForRequest, action => -> $obj { #`(Block|4570571503960) ... } >

(Each pair shows the name of the state object and the object itself.)

Here is the graph of FSM’s state transitions:

$abcFSM.to-mermaid-js

Remark: In order to obtain a Mathematica — or Wolfram Language (WL) — representation of the state transitions graph the method to-wl can be used.
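Here is a minimal sketch of that method call (assuming it takes no arguments):

$abcFSM.to-wl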

Here is how the dataset of the created FSM looks:

.say for $abcFSM.dataset.pick(3);

# {Company => X-Men, DiscordHandle => hugh.jackman#1391, Email => hugh.jackman.523@aol.com, Name => Hugh Jackman, Phone => 940-463-2296, Position => actor}
# {Company => Caribbean Pirates, DiscordHandle => jack.davenport#1324, Email => jack.davenport.152@icloud.net, Name => Jack Davenport, Phone => 627-500-7919, Position => actor}
# {Company => LOTR, DiscordHandle => robert.shaye#6399, Email => robert.shaye.768@gmail.com, Name => Robert Shaye, Phone => 292-252-6866, Position => producer}

For an interactive execution of the FSM we use the command:

#$abcFSM.run('WaitForCallCommand');

Here we run the FSM with a sequence of commands:

$abcFSM.run('WaitForCallCommand', 
        ["call an actor from LOTR", "", 
         "take last three", "", 
         "take the second", "", "", 
         "2", "5", "", 
         "quit"]);

# 🔊 PLEASE enter call request.
# filter by Position is "actor" and Company is "LOTR"
# 🔊 LISTING items.
# ⚙️ListOfItems: Obtained the records:
# ⚙️+--------------+-----------------+--------------------------------+----------+---------+----------------------+
# ⚙️|    Phone     |       Name      |             Email              | Position | Company |    DiscordHandle     |
# ⚙️+--------------+-----------------+--------------------------------+----------+---------+----------------------+
# ⚙️| 408-573-4472 |   Andy Serkis   |   andy.serkis.981@gmail.com    |  actor   |   LOTR  |   andy.serkis#8484   |
# ⚙️| 321-985-9291 |   Elijah Wood   |     elijah.wood.53@aol.com     |  actor   |   LOTR  |   elijah.wood#7282   |
# ⚙️| 298-517-5842 |   Ian McKellen  |    ian.mckellen581@aol.com     |  actor   |   LOTR  |  ian.mckellen#9077   |
# ⚙️| 608-925-5727 |    Liv Tyler    |    liv.tyler1177@gmail.com     |  actor   |   LOTR  |    liv.tyler#8284    |
# ⚙️| 570-406-4260 |  Orlando Bloom  |  orlando.bloom.914@gmail.net   |  actor   |   LOTR  |  orlando.bloom#6219  |
# ⚙️| 365-119-3172 |    Sean Astin   |   sean.astin.1852@gmail.net    |  actor   |   LOTR  |   sean.astin#1753    |
# ⚙️| 287-691-8138 | Viggo Mortensen | viggo.mortensen1293@icloud.com |  actor   |   LOTR  | viggo.mortensen#7157 |
# ⚙️+--------------+-----------------+--------------------------------+----------+---------+----------------------+
# 🔊 PLEASE enter call request.
# 🔊 LISTING items.
# ⚙️ListOfItems: Obtained the records:
# ⚙️+--------------+----------------------+----------+-----------------+---------+--------------------------------+
# ⚙️|    Phone     |    DiscordHandle     | Position |       Name      | Company |             Email              |
# ⚙️+--------------+----------------------+----------+-----------------+---------+--------------------------------+
# ⚙️| 570-406-4260 |  orlando.bloom#6219  |  actor   |  Orlando Bloom  |   LOTR  |  orlando.bloom.914@gmail.net   |
# ⚙️| 365-119-3172 |   sean.astin#1753    |  actor   |    Sean Astin   |   LOTR  |   sean.astin.1852@gmail.net    |
# ⚙️| 287-691-8138 | viggo.mortensen#7157 |  actor   | Viggo Mortensen |   LOTR  | viggo.mortensen1293@icloud.com |
# ⚙️+--------------+----------------------+----------+-----------------+---------+--------------------------------+
# 🔊 PLEASE enter call request.
# 🔊 LISTING items.
# ⚙️ListOfItems: Obtained the records:
# ⚙️+---------+-----------------+------------+--------------+----------+---------------------------+
# ⚙️| Company |  DiscordHandle  |    Name    |    Phone     | Position |           Email           |
# ⚙️+---------+-----------------+------------+--------------+----------+---------------------------+
# ⚙️|   LOTR  | sean.astin#1753 | Sean Astin | 365-119-3172 |  actor   | sean.astin.1852@gmail.net |
# ⚙️+---------+-----------------+------------+--------------+----------+---------------------------+
# 🔊 ACQUIRE item: {Company => LOTR, DiscordHandle => sean.astin#1753, Email => sean.astin.1852@gmail.net, Name => Sean Astin, Phone => 365-119-3172, Position => actor}
# ⚙️Acquiring contact info for : ⚙️Sean Astin
# 🔊 ACT ON item: {Company => LOTR, DiscordHandle => sean.astin#1753, Email => sean.astin.1852@gmail.net, Name => Sean Astin, Phone => 365-119-3172, Position => actor}
# ⚙️[1] email, [2] phone message, [3] phone call, [4] discord message, or [5] nothing
# ⚙️(choose one...)
# ⚙️message by phone 365-119-3172
# 🔊 ACT ON item: {Company => LOTR, DiscordHandle => sean.astin#1753, Email => sean.astin.1852@gmail.net, Name => Sean Astin, Phone => 365-119-3172, Position => actor}
# ⚙️[1] email, [2] phone message, [3] phone call, [4] discord message, or [5] nothing
# ⚙️(choose one...)
# ⚙️do nothing
# 🔊 SHUTTING down...


Object Oriented Design

Here is the Unified Modeling Language (UML) diagram corresponding to the classes in this package:

(The UML spec and the Mermaid spec above were automatically generated with “UML::Translators”, [AAp5].)

Here is the MermaidJS spec generation shell command:

to-uml-spec --format=MermaidJS DSL::FiniteStateMachines 


References

Packages

[AAp1] Anton Antonov, DSL::Shared Raku package, (2020), GitHub/antononcube.

[AAp2] Anton Antonov, DSL::Entity::Metadata Raku package, (2021), GitHub/antononcube.

[AAp3] Anton Antonov, DSL::English::DataAcquisitionWorkflows Raku package, (2021), GitHub/antononcube.

[AAp4] Anton Antonov, DSL::Entity::AddressBook Raku package, (2023), GitHub/antononcube.

[AAp5] Anton Antonov, UML::Translators Raku package, (2021), GitHub/antononcube.

Videos

[AAv1] Anton Antonov, “Multi-language Data Wrangling and Acquisition Conversational Agents (in Raku)”, (2021), YouTube.com.

Racoons playing with pearls and onions

This blog post consists of the slides of the presentation “Racoons playing with pearls and onions”:


Image generation with OpenAI

… using the Raku module “WWW::OpenAI”

Anton Antonov

 RakuForPrediction-book at GitHub 

March 2023


Presentation plan

Warm-up with text generation

  • ChatGPT

Image generations

  • URLs
  • Base64s

Using Comma IDE and CLI

Other examples


Setup

Load the package

use WWW::OpenAI;

(*"(Any)"*)

Get the authorization key

RakuInputExecute["my $auth-key='" <> AUTHKEY <> "'"];


Warm up (completions)

my @txtRes = |openai-completion(
	'which is the most successful programming language', 
	n=>3, 
	temperature=>1.3, 
	max-tokens=>120, 
	format=>'values',	
	:$auth-key);
	
@txtRes.elems

(*"3"*)

RakuInputExecute["@txtRes==>encode-to-wl"] // ResourceFunction["GridTableForm"]


Image generation (URLs)

my @imgRes = |openai-create-image(
	'Racoon with pearls in the style Raphael', 
	 n=>3,  
	 response-format=>'url',
	 format=>'values',
	 :$auth-key);
	
@imgRes.elems;

(*"3"*)

Magnify[Import[#], 2.5] & /@ RakuInputExecute["@imgRes==>encode-to-wl"]

@imgRes


Image generation (Base64s)

my @imgRes2 = |openai-create-image(
	'Racoon and sliced onion in the style Rene Magritte',  
	n=>3,
	response-format=>'b64_json', 
	format=>'values',	
	:$auth-key);
	
@imgRes2.elems;

(*"3"*)

Magnify[ImportByteArray[BaseDecode[#]], 2.5] & /@ RakuInputExecute["@imgRes2==>encode-to-wl"]

@imgRes2

WebImageSearch["Son of Man Magritte", 3]


Using Comma IDE and CLI

“Text::CodeProcessing” used in Mathematica and executable Markdown


Some more

Monet

my @imgRes3 = |openai-create-image('Racoons playing onions and perls in the style Hannah Wilke', n=>3, size => 'medium', format=>'values', response-format=>'b64_json', :$auth-key);
@imgRes3.elems;

(*"3"*)

Magnify[ImportByteArray[BaseDecode[#]], 1.5] & /@ RakuInputExecute["@imgRes3==>encode-to-wl"]

openai-create-image('Racoons playing onions and perls in the style Monet', n=>3, size => 'largers')

(*"#ERROR: The argument $size is expected to be Whatever or one of '1024x1024, 256x256, 512x512, large, medium, small'.Nil"*)

Helmut Newton

my @imgRes3 = |openai-create-image('Racoon in the style Helmut Newton', n=>3, size => 'small', format=>'values', response-format=>'b64_json', :$auth-key);
@imgRes3.elems;

(*"3"*)

Magnify[ImportByteArray[BaseDecode[#]], 1.5] & /@ RakuInputExecute["@imgRes3==>encode-to-wl"]


Hieronymus Bosch

my @imgRes3 = |openai-create-image('how we live now in the style of Hieronymus Bosch', n=>3, size => 'small', format=>'values', response-format=>'b64_json', :$auth-key);
@imgRes3.elems;

(*"3"*)

Magnify[ImportByteArray[BaseDecode[#]], 1.5] & /@ RakuInputExecute["@imgRes3==>encode-to-wl"]


Inkblots

my @imgRes4 = |openai-create-image('Racoon in the style Rorschach inkblot', :$auth-key,  n=>6,  format=>'values');
@imgRes4.elems;

(*"6"*)

Magnify[Import[#], 1.5] & /@ RakuInputExecute["@imgRes4==>encode-to-wl"]



Alternative

Table[ResourceFunction["RandomRorschach"]["NumberOfStrokes" -> RandomChoice[{8, 12, 20}, RandomInteger[{1, 4}]], ColorFunction -> (GrayLevel[RandomReal[{0.1, 0.2}]] &), "ImageEffects" -> {{"Jitter", 16}, {"OilPainting", 6}}], 12]


Goldberg machines

my @imgRes5 = |openai-create-image('Camels in a Rube Goldberg machine', :$auth-key,  n=>6,  format=>'values');
@imgRes5.elems;

(*"6"*)

Magnify[Import[#], 1.5] & /@ RakuInputExecute["@imgRes5==>encode-to-wl"]


WWW::OpenAI (for ChatGPT and other statistical gimmicks)

In brief

The Raku package “WWW::OpenAI” provides access to the machine learning service OpenAI, [OAI1]. For more details of OpenAI’s API usage see the documentation, [OAI2].

Remark: To use the OpenAI API one has to register and obtain an authorization key.

Remark: This Raku package is much “less ambitious” than the official Python package, [OAIp1], developed by OpenAI’s team. Gradually, over time, I expect to add features to the Raku package that correspond to features of [OAIp1].

The design and implementation of “WWW::OpenAI” are very similar to those of “Lingua::Translation::DeepL”, [AAp1].


Installation

Package installations from both sources use the zef installer (which should be bundled with the “standard” Rakudo installation file.)

To install the package from Zef ecosystem use the shell command:

zef install WWW::OpenAI

To install the package from the GitHub repository use the shell command:

zef install https://github.com/antononcube/Raku-WWW-OpenAI.git


Usage examples

Remark: When the authorization key, auth-key, is specified to be Whatever then openai-playground attempts to use the env variable OPENAI_API_KEY.

Basic usage

Here is a simple call:

use WWW::OpenAI;
say openai-playground('Where is Roger Rabbit?');

# [{finish_reason => stop, index => 0, message => {content => 
# 
# As an AI language model, I do not have access to real-time information or location tracking features. Therefore, I cannot provide an accurate answer to the question "Where is Roger Rabbit?" without additional context. However, if you are referring to the fictional character Roger Rabbit, he is a cartoon character created by Disney and is typically found in various media, including films, television shows, and comic books., role => assistant}}]

Another one using Bulgarian:

say openai-playground('Колко групи могат да се намерят в този облак от точки.');

# [{finish_reason => stop, index => 0, message => {content => 
# 
# Като асистент на AI, не мога да видя облак от точки, за да мога да дам точен отговор на този въпрос. Моля, предоставете повече информация или конкретен пример, за да мога да ви помогна., role => assistant}}]


Command Line Interface

The package provides a Command Line Interface (CLI) script:

openai-playground --help

# Usage:
#   openai-playground <text> [-m|--model=<Str>] [-r|--role=<Str>] [-t|--temperature[=Real]] [-a|--auth-key=<Str>] [--timeout[=UInt]] [--format=<Str>] -- Text processing using the OpenAI API.
#   
#     <text>                     Text to be processed.
#     -m|--model=<Str>           Model. [default: 'Whatever']
#     -r|--role=<Str>            Role. [default: 'user']
#     -t|--temperature[=Real]    Temperature. [default: 0.7]
#     -a|--auth-key=<Str>        Authorization key (to use OpenAI API.) [default: 'Whatever']
#     --timeout[=UInt]           Timeout. [default: 10]
#     --format=<Str>             Format of the result; one of "json" or "hash". [default: 'json']

Remark: When the authorization key argument “auth-key” is set to “Whatever” then openai-playground attempts to use the env variable OPENAI_API_KEY.


Mermaid diagram

The following flowchart corresponds to the steps in the package function openai-playground:


References

[AAp1] Anton Antonov, Lingua::Translation::DeepL Raku package, (2022), GitHub/antononcube.

[OAI1] OpenAI Platform, OpenAI platform.

[OAI2] OpenAI Platform, OpenAI documentation.

[OAIp1] OpenAI, OpenAI Python Library, (2020), GitHub/openai.

Literate programming via CLI

In this blog post we demonstrate how to do “Literate programming” in Raku via (short) pipelines of Command Line Interface (CLI) scripts.

An alternative of this CLI-based process is to use Mathematica or Jupyter notebooks.

Presentation’s GitHub directory

Video recording


Steps and components

Here is a narration of the diagram above:

  1. Write some text and code in a Markdown file
  2. Weave the Markdown file (i.e. “run it”)
  3. If the woven file does not have D3.js graphics go to 5
  4. Convert the woven Markdown file into HTML
  5. Examine results
  6. Go to 1
    • Or finish “fiddling with it”

The conversions

Here we use the document “Cryptocurrencies-explorations.md” (with Raku code over cryptocurrencies data.)

If no D3.js graphics are specified then we can use the shell command:

file-code-chunks-eval Cryptocurrencies-explorations.md

If D3.js graphics are specified then we can use the shell command:

file-code-chunks-eval Cryptocurrencies-explorations.md && 
  from-markdown Cryptocurrencies-explorations_woven.md -t html -o out.html && 
  open out.html

Remark: It is instructive to examine “Cryptocurrencies-explorations_woven.md” and compare it with “Cryptocurrencies-explorations.md”.

Remark: The code chunks with graphics (using “JavaScript::D3”) have to have the chunk option setting results=asis.

Remark: The “JavaScript::D3” commands have to have the option settings format => 'html' and div-id => ....
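Here is a hedged sketch of such a code chunk as it would appear in the Markdown file (the chunk-option syntax is an assumption about “Text::CodeProcessing”; the function js-d3-list-plot and the data are illustrative):

```raku, results=asis
use JavaScript::D3;

# Plot 30 random values; emit HTML with a dedicated div element
js-d3-list-plot((^30)».rand, format => 'html', div-id => 'd3-plot-1')
```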


References

Articles

[AA1] Anton Antonov “Raku Text::CodeProcessing”, (2021), RakuForPrediction at WordPress.

[AA2] Anton Antonov “JavaScript::D3”, (2022), RakuForPrediction at WordPress.

[AA3] Anton Antonov “Further work on the Raku-D3.js translation”, (2022), RakuForPrediction at WordPress.

Packages

[AAp1] Anton Antonov, Data::Cryptocurrencies Raku package, (2023). GitHub/antononcube.

[AAp2] Anton Antonov, JavaScript::D3 Raku package, (2022). GitHub/antononcube.

[AAp3] Anton Antonov, Text::CodeProcessing Raku package, (2021). GitHub/antononcube.

[AAp4] Anton Antonov, Markdown::Grammar, (2022). GitHub/antononcube.

Gherkin::Grammar

This blog post introduces and briefly describes the Raku package “Gherkin::Grammar” for Gherkin test specifications parsing and interpretations.

Gherkin is the language of the Cucumber framework, [Wk1], that is used to do Behavior-Driven Development (BDD), [Wk2].

The Raku package “Cucumis Sextus”, [RLp1], aims to provide a “full-blown” specification-and-execution framework in Raku, like the typical Cucumber functionalities in other languages. (Ruby, Java, etc.)

This package, “Gherkin::Grammar”, takes a minimalist perspective; it aims to provide:

  • Grammar (and roles) for parsing Gherkin specifications
  • Test file template generation

Having a “standalone” Gherkin grammar (or role) facilitates the creation and execution of general or specialized frameworks for Raku support of BDD.

The package provides the functions:

  • gherkin-parse
  • gherkin-subparse
  • gherkin-interpret

The Raku outputs of gherkin-interpret are test file templates that after filling-in would provide tests that correspond to the input specifications.

Remark: A good introduction to the Cucumber / Gherkin approach and workflows is the README of [RLp1].

Remark: The grammar in this package was programmed following the specifications and explanations in Gherkin Reference.


Installation

From Zef ecosystem:

zef install Gherkin::Grammar

From GitHub:

zef install https://github.com/antononcube/Raku-Gherkin-Grammar


Workflow

The package follows the general Cucumber workflow, but some elements are less automated. Here is a flowchart:

Here is corresponding narration:

  1. Write tests using Gherkin specs
  2. Generate test code
    • Using the package “Gherkin::Grammar”.
  3. Fill-in the code of step functions
  4. Execute tests
  5. Revisit (refine) steps 1 and/or 4 as needed
  6. Integrate resulting test file

Remark: See the Cucumber framework flowchart in the files Flowcharts.md.


Usage examples

Here is a basic (and short) Gherkin spec interpretation example:

use Gherkin::Grammar;

my $text0 = q:to/END/;
Feature: Calculation
    Example: One plus one
    When 1 + 1
    Then 2
END

gherkin-interpret($text0);

# use v6.d;
# 
# #============================================================
# 
# proto sub Background($descr) {*}
# proto sub ScenarioOutline(@cmdFuncPairs) {*}
# proto sub Example($descr) {*}
# proto sub Given(Str:D $cmd, |) {*}
# proto sub When(Str:D $cmd, |) {*}
# proto sub Then(Str:D $cmd, |) {*}
# 
# #============================================================
# 
# use Test;
# plan *;
# 
# #============================================================
# # Example : One plus one
# #------------------------------------------------------------
# 
# multi sub When( $cmd where * eq '1 + 1' ) {}
# 
# multi sub Then( $cmd where * eq '2' ) {}
# 
# multi sub Example('One plus one') {
# 	When( '1 + 1' );
# 	Then( '2' );
# }
# 
# is Example('One plus one'), True, 'One plus one';
# 
# done-testing;

Internationalization

The package provides internationalization using different languages. The (initial) internationalization keyword-regexes data structure was taken from [RLp1]. (See the file “I18n.rakumod”.)

Here is an example with Russian:

my $ru-text = q:to/END/;
Функционал: Вычисление
    Пример: одно плюс одно
    Когда 1 + 1
    Тогда 2
END

gherkin-interpret($ru-text, lang => 'Russian');

# use v6.d;
# 
# #============================================================
# 
# proto sub Background($descr) {*}
# proto sub ScenarioOutline(@cmdFuncPairs) {*}
# proto sub Example($descr) {*}
# proto sub Given(Str:D $cmd, |) {*}
# proto sub When(Str:D $cmd, |) {*}
# proto sub Then(Str:D $cmd, |) {*}
# 
# #============================================================
# 
# use Test;
# plan *;
# 
# #============================================================
# # Example : одно плюс одно
# #------------------------------------------------------------
# 
# multi sub When( $cmd where * eq '1 + 1' ) {}
# 
# multi sub Then( $cmd where * eq '2' ) {}
# 
# multi sub Example('одно плюс одно') {
# 	When( '1 + 1' );
# 	Then( '2' );
# }
# 
# is Example('одно плюс одно'), True, 'одно плюс одно';
# 
# done-testing;

Doc-string Arguments

The package takes both doc-strings and tables as step arguments.

Doc-strings are put between lines with triple quotes; the text between the quotes is given as second argument of the corresponding step function.

Here is an example of a Gherkin specification for testing a data wrangling Domain Specific Language (DSL) parser-interpreter, [AA1, AAp2], that uses doc-string:

Feature: Data wrangling DSL pipeline testing

  Scenario: Long pipeline
    Given target is Raku
    And titanic dataset exists
    When is executed the pipeline:
      """
      use @dsTitanic;
      filter by passengerSurvival is "survived";
      cross tabulate passengerSex vs passengerClass
      """
    Then result is a hash

That specification is part of the Gherkin file: “DSL-for-data-wrangling.feature”.

The corresponding code generated by “Gherkin::Grammar” is given in the file: “DSL-for-data-wrangling-generated.rakutest”.

The fill-in definitions of the corresponding functions are given in the file: “DSL-for-data-wrangling.rakutest”.
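Following the pattern of the generated templates above, here is a hedged sketch of a filled-in doc-string step function (the body is hypothetical; the actual definitions are in the files linked above):

# The doc-string text arrives as the second argument of the step function
multi sub When( $cmd where * eq 'is executed the pipeline:', Str:D $doc-string ) {
    # $doc-string holds the text between the triple quotes
    say $doc-string;
}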

Table arguments

The package handles tables as step arguments. The table arguments are treated differently in Example or Scenario blocks than in Scenario outline blocks.

Here is a “simple” use of a table:

Feature: DateTime parsing tests

  Scenario: Simple
    When today, yesterday, tomorrow
    Then the results adhere to:
      | Spec      | Result                        |
      | today     | DateTime.today                |
      | yesterday | DateTime.today.earlier(:1day) |
      | tomorrow  | DateTime.today.later(:1day)   |

Here is a Scenario Outline spec:

Feature: DateTime parsing tests 2

   Scenario Outline: Repeated
      Given <Spec>
      Then <Result>
      Examples: the results adhere to:
         | Spec      | Result                        |
         | today     | DateTime.today                |
         | yesterday | DateTime.today.earlier(:1day) |
         | tomorrow  | DateTime.today.later(:1day)   |

Remark: The package “Markdown::Grammar”, [AAp1], parses tables in a similar manner, but [AAp1] assumes that a table field can have plain words, words with slant or weight, or hyperlinks.

Remark: The package [AAp1] parses tables with- and without headers. The Gherkin language descriptions and examples I have seen did not have tables with header separators. Hence, a header separator is treated as a regular table row in “Gherkin::Grammar”.


Complete examples

Calculator

The files “Calculator.feature” and “Calculator.rakutest” provide a simple, fully worked example of how this package can be used to implement Cucumber framework workflows.

Remark: The Cucumber framework(s) expect Gherkin test specifications to be written in files with extension “.feature”.

DateTime interpretation

The date-time interpretations of the package “DateTime::Grammar”, [AAp3], are tested with the feature file “DateTime-interpretation.feature” (and the related “*.rakutest” files.)

Numeric word forms parsing

The interpretations of numeric word forms into number of the package “Lingua::NumericWordForms”, [AAp4], are tested with the feature file “Numeric-word-forms-parsing.feature” (and the related “*.rakutest” files.)

DSL for data wrangling

The data wrangling translations and execution results of the package “DSL::English::DataQueryWorkflows”, [AA1, AAp2], are tested with the feature file “DSL-for-data-wrangling.feature” (and the related “*.rakutest” files.)

This is a fairly non-trivial example that involves multiple packages. Also, it makes a lot of sense to test DSL translators using a testing DSL (like Gherkin.)


CLI

The package provides a Command Line Interface (CLI) script. Here is its help message:

gherkin-interpretation --help

# Usage:
#   gherkin-interpretation <fileName> [-l|--from-lang=<Str>] [-t|--to-lang=<Str>] [-o|--output=<Str>] -- Interprets Gherkin specifications.
#   
#     -l|--from-lang=<Str>    Natural language in which the feature specification is written in. [default: 'English']
#     -t|--to-lang=<Str>      Language to interpret (translate) the specification to. [default: 'Raku']
#     -o|--output=<Str>       File to place the interpretation to. (If '-' stdout is used.) [default: '-']
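For example, here is a hedged invocation that interprets the calculator feature file mentioned above into a Raku test file (the file names are illustrative):

gherkin-interpretation Calculator.feature -o=Calculator-generated.rakutest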


References

Articles

[AA1] Anton Antonov, “Introduction to data wrangling with Raku” , (2021), RakuForPrediction at WordPress.

[SB1] SmartBear, “Gherkin Reference”, (2023), cucumber.io.

[Wk1] Wikipedia entry, “Cucumber (software)”. See also cucumber.io.

[Wk2] Wikipedia entry, “Behavior-driven development”.

Packages

[AAp1] Anton Antonov, Markdown::Grammar Raku package, (2022-2023), GitHub/antononcube.

[AAp2] Anton Antonov, DSL::English::DataQueryWorkflows Raku package, (2021-2023), GitHub/antononcube.

[AAp3] Anton Antonov, DateTime::Grammar Raku package, (2023), GitHub/antononcube.

[AAp4] Anton Antonov, Lingua::NumericWordForms Raku package, (2021-2023), GitHub/antononcube.

[RLp1] Robert Lemmen, Cucumis Sextus Raku package, (2017-2020), GitHub/robertlemmen.

Data::Cryptocurrencies

The Raku package “Data::Cryptocurrencies” has functions for cryptocurrency data retrieval. (At this point, only Yahoo Finance is used as a data source.)

The implementation follows the Mathematica implementation in [AAf1] described in [AA1]. (Further explorations are discussed in [AA2].)


Installation

From Zef ecosystem:

zef install Data::Cryptocurrencies

From GitHub:

zef install https://github.com/antononcube/Raku-Data-Cryptocurrencies.git


Usage examples

Here we get Bitcoin (BTC) data from 1/1/2020 until now:

use Data::Cryptocurrencies;
use Data::Summarizers;
use Text::Plot;

my @ts = cryptocurrency-data('BTC', dates => (DateTime.new(2020, 1, 1, 0, 0, 0), now), props => <DateTime Close>,
        format => 'dataset'):!cache-all;

say @ts.elems;

# 1137

When we request the data to be returned as a “dataset”, the result is an array of hashes. When we request the data to be returned as a “timeseries”, the result is an array of pairs (sorted by date.)
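
Here is a sketch of the “timeseries” request (per the description above, the result elements are date-value pairs; the exact argument combination may differ, and the output is elided):

my @pairs = cryptocurrency-data('BTC',
        dates  => (DateTime.new(2020, 1, 1, 0, 0, 0), now),
        props  => <DateTime Close>,
        format => 'timeseries');

# An array of pairs sorted by date; inspect the last few:
.say for @pairs.tail(3);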

Here are BTC values for the last week (at the point of retrieval):

.say for @ts.tail(7)

# {Close => 23331.847656, DateTime => 2023-02-04T00:00:00Z}
# {Close => 22955.666016, DateTime => 2023-02-05T00:00:00Z}
# {Close => 22760.109375, DateTime => 2023-02-06T00:00:00Z}
# {Close => 23264.291016, DateTime => 2023-02-07T00:00:00Z}
# {Close => 22939.398438, DateTime => 2023-02-08T00:00:00Z}
# {Close => null, DateTime => 2023-02-09T00:00:00Z}
# {Close => 21863.925781, DateTime => 2023-02-10T00:00:00Z}

Here is a summary:

records-summary(@ts, field-names => <DateTime Close>);

# +--------------------------------+----------------------+
# | DateTime                       | Close                |
# +--------------------------------+----------------------+
# | Min    => 2020-01-01T00:00:37Z | 8755.246094  => 1    |
# | 1st-Qu => 2020-10-10T12:00:37Z | 45585.03125  => 1    |
# | Mean   => 2021-07-22T00:00:37Z | 55888.132813 => 1    |
# | Median => 2021-07-22T00:00:37Z | 29098.910156 => 1    |
# | 3rd-Qu => 2022-05-02T12:00:37Z | 20471.482422 => 1    |
# | Max    => 2023-02-10T00:00:37Z | 21395.019531 => 1    |
# |                                | 53555.109375 => 1    |
# |                                | (Other)      => 1130 |
# +--------------------------------+----------------------+

Clean the data by dropping the record with the non-numeric Close value:

@ts = @ts.grep({ $_<Close> ~~ Numeric }).Array;
say @ts.elems;

# 1136

Here is a text-based plot of the corresponding time series:

say text-list-plot(@ts.map(*<DateTime>.Instant.Int).List, @ts.map(*<Close>).List, width => 100, height => 20);

# +------+-----------------+-----------------+-----------------+-----------------+-----------------+-+          
# +                                                                                                  +  70000.00
# |                                                       * *                                        |          
# |                                      *  *             ***                                        |          
# +                                     *******           *****                                      +  60000.00
# |                                     *******        * **  **                                      |          
# +                                    **** ***       ** *    ***                                    +  50000.00
# |                                    **     *      *****    ***  *   *                             |          
# |                                 *         **    **  **      ** *******                           |          
# +                                 ****      ****  *             ***** ***                          +  40000.00
# |                                 ***        ******             *       *                          |          
# +                                 ***          * *                      ***                        +  30000.00
# |                                **                                        *   *                   |          
# |                                *                                         ********* **    ***     |          
# +                             ***                                          ***  ************       +  20000.00
# |                      *    ***                                                       **           |          
# +     ******   **************                                                                      +  10000.00
# |    **    *****                                                                                   |          
# +                                                                                                  +      0.00
# +------+-----------------+-----------------+-----------------+-----------------+-----------------+-+          
#        1580000000.00     1600000000.00     1620000000.00     1640000000.00     1660000000.00     1680000000.00

Instead of the text plot above we can use “JavaScript::D3” to make an interactive plot.
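
For instance, in a Jupyter notebook cell (a sketch assuming the function js-d3-list-line-plot of “JavaScript::D3”; check that package's documentation):

use JavaScript::D3;

# Make an interactive D3.js line plot of the time series
# (sketch; js-d3-list-line-plot and its parameters are assumptions here):
js-d3-list-line-plot(
        @ts.map({ %( x => $_<DateTime>.Instant.Int, y => $_<Close> ) }).Array,
        width => 800, height => 300);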


Data caching

Since the downloading of the cryptocurrencies data can be time consuming (close to 1 minute), it is a good idea to do data caching.

Data caching is “triggered” with the adverb :cache-all. Here is an example:

cryptocurrency-data('BTC', dates => 'today'):cache-all;

By default no data caching is done (i.e. :!cache-all.) Cached data is stored in the $XDG_DATA_HOME directory; see [JS1]. New data is obtained (and cached) once per day.
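
For reference, a small sketch of locating that directory with [JS1] (the method name is an assumption; verify against that package's documentation):

use XDG::BaseDirectory;

# User-specific data directory: typically ~/.local/share,
# unless $XDG_DATA_HOME is set.
say XDG::BaseDirectory.new.data-home;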

Remark: The use of cached data greatly speeds up the cryptocurrencies explorations with this package.


CLI

The package provides a Command Line Interface (CLI) script. Here is its usage message:

cryptocurrency-data --help

# Usage:
#   cryptocurrency-data [<symbol>] [-p|--properties=<Str>] [--start-date=<Str>] [--end-date=<Str>] [-c|--currency=<Str>] [--format=<Str>] -- Retrieves cryptocurrency data.
#   
#     [<symbol>]               Cryptocurrency symbol. [default: 'BTC']
#     -p|--properties=<Str>    Properties to retrieve. [default: 'all']
#     --start-date=<Str>       Start date. [default: 'auto']
#     --end-date=<Str>         End date. [default: 'now']
#     -c|--currency=<Str>      Currency. [default: 'USD']
#     --format=<Str>           Format of the result [default: 'json']


Additional usage examples

The notebook “Cryptocurrency-explorations.ipynb” provides additional usage examples using D3.js plots. (It loosely follows [AA2].)


References

Articles

[AA1] Anton Antonov, “Crypto-currencies data acquisition with visualization”, (2021), MathematicaForPrediction at WordPress.

[AA2] Anton Antonov, “Cryptocurrencies data explorations”, (2021), MathematicaForPrediction at WordPress.

Functions, packages

[AAf1] Anton Antonov, CryptocurrencyData Mathematica resource function, (2021), WolframCloud/antononcube.

[AAp1] Anton Antonov, Data::Summarizers Raku package, (2021-2023), GitHub/antononcube.

[AAp2] Anton Antonov, Text::Plot Raku package, (2021), GitHub/antononcube.

[JS1] Jonathan Stowe, XDG::BaseDirectory Raku package, (2016-2023), Zef-ecosystem/jonathanstowe.

DateTime::Grammar

Introduction

The Raku package “DateTime::Grammar” provides a grammar (role) and interpreters for parsing datetime specifications.

Most of the code is from [FS1]. The original code of the file “Parse.rakumod” (from [FS1]) was separated into the files “Grammarish.rakumod” and “Actions/Raku.rakumod”.

The code in “Grammar.rakumod” provides the “top-level” functions:

  • datetime-parse
  • datetime-subparse
  • datetime-interpret

Remark: Code that uses DateTime::Parse.new can be replaced with calls to datetime-interpret. Compare the test files of the “DateTime::Grammar” repository that have names starting with “01-” with the corresponding files in [FS1].
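
Judging by the names (an assumption, not verified here), datetime-parse and datetime-subparse correspond to the grammar's parse and subparse methods and return Match objects, while datetime-interpret also applies the Raku actions:

use DateTime::Grammar;

# Presumably, parsing returns a Match object:
say datetime-parse('Sun, 06 Nov 1994 08:49:37 GMT').^name;

# ...while interpreting returns a DateTime object
# (see the usage examples below):
say datetime-interpret('Sun, 06 Nov 1994 08:49:37 GMT').^name;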


Installation

From Zef ecosystem:

zef install DateTime::Grammar

From GitHub:

zef install https://github.com/antononcube/Raku-DateTime-Grammar.git


Usage examples

Interpret a full blown datetime spec:

use DateTime::Grammar;
my $rfc1123 = datetime-interpret('Sun, 06 Nov 1994 08:49:37 GMT');
$rfc1123.raku

# DateTime.new(1994,11,6,8,49,37)

Just the date:

$rfc1123.Date;

# 1994-11-06

Seventh day of the week (the wkday rule gives 0-based day indices, hence the + 1):

datetime-interpret('Sun', :rule<wkday>) + 1;

# 7

With the adverb :extended we can control whether datetime specs are allowed to be just dates (i.e. without a time part). Here are examples:

datetime-interpret('1/23/1089'):extended;

# 1089-01-23T00:00:00Z

datetime-interpret('1/23/1089'):!extended;

# (Any)


Using the role in “external” grammars

Here is how the role “Grammarish” of “DateTime::Grammar” can be used in “higher order” grammars:

my grammar DateTimeInterval 
    does DateTime::Grammarish {

    rule TOP($*extended) { 'from' <from=.datetime-param-spec> 'to' <to=.datetime-param-spec> } 
};

DateTimeInterval.parse('from 2022-12-02 to Oct 4 2023', args => (True,))

# 「from 2022-12-02 to Oct 4 2023」
#  from => 「2022-12-02」
#   date-spec => 「2022-12-02」
#    date5 => 「2022-12-02」
#     year => 「2022」
#     month => 「12」
#     day => 「02」
#  to => 「Oct 4 2023」
#   date-spec => 「Oct 4 2023」
#    date8 => 「Oct 4 2023」
#     month => 「Oct」
#      month-short-name => 「Oct」
#     day => 「4」
#     year => 「2023」

The parameter $*extended can be eliminated by using <datetime-spec> instead of <datetime-param-spec>:

my grammar DateTimeInterval2 
    does DateTime::Grammarish {

    rule TOP { 'from' <from=.datetime-spec> 'to' <to=.datetime-spec> } 
};

DateTimeInterval2.parse('from 2022-12-02 to Oct 4 2023')

# 「from 2022-12-02 to Oct 4 2023」
#  from => 「2022-12-02」
#   date-spec => 「2022-12-02」
#    date5 => 「2022-12-02」
#     year => 「2022」
#     month => 「12」
#     day => 「02」
#  to => 「Oct 4 2023」
#   date-spec => 「Oct 4 2023」
#    date8 => 「Oct 4 2023」
#     month => 「Oct」
#      month-short-name => 「Oct」
#     day => 「4」
#     year => 「2023」

CLI

The package provides a Command Line Interface (CLI) script. Here is its usage message:

datetime-interpretation --help

# Usage:
#   datetime-interpretation <spec> [-t|--target=<Str>] -- Interpret datetime spec.
#   datetime-interpretation [<words> ...] [-t|--target=<Str>] -- Interpret datetime spec obtained by a sequence of strings.
#   datetime-interpretation [-t|--target=<Str>] -- Interpret datetime spec from pipeline input
#   
#     <spec>               Datetime specification.
#     -t|--target=<Str>    Interpretation target. [default: 'Raku']
#     [<words> ...]        Datetime specification.


References

[FS1] Filip Sergot, DateTime::Parse Raku package, (2017-2022), GitHub/sergot.