WWW::MermaidInk

The function mermaid-ink of the Raku package “WWW::MermaidInk” retrieves images corresponding to Mermaid-js specifications via the Mermaid-ink web interface of Mermaid-js.

For a “full set” of examples see the file MermaidInk_woven.html.


Usage

use WWW::MermaidInk
loads the package.

mermaid-ink($spec)
retrieves an image defined by the spec $spec from Mermaid’s Ink Web interface.

mermaid-ink($spec, format => 'md-image')
returns a string that is a Markdown image specification in Base64 format.

mermaid-ink($spec, file => fileName)
exports the retrieved image into a specified PNG file.

mermaid-ink($spec, file => Whatever)
exports the retrieved image into the file $*CWD ~ '/out.png'.
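
For example, here is a minimal sketch that exports a (trivial) flowchart to a PNG file:

use WWW::MermaidInk;

mermaid-ink('graph TD; A --> B', file => 'out.png');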

Details & Options

  • Mermaid lets you create diagrams and visualizations using text and code.
  • Mermaid has different types of diagrams: Flowchart, Sequence Diagram, Class Diagram, State Diagram, Entity Relationship Diagram, User Journey, Gantt, Pie Chart, Requirement Diagram, and others. It is a JavaScript based diagramming and charting tool that renders Markdown-inspired text definitions to create and modify diagrams dynamically.
  • mermaid-ink uses Mermaid’s functionalities via the web interface “https://mermaid.ink/img”.
  • The first argument can be a string (that is, a mermaid-js specification) or a list of pairs.
  • The option “directives” can be used to control the layout of Mermaid diagrams if the first argument is a list of pairs.
  • mermaid-ink produces images only.

Examples

Basic Examples

Generate a flowchart from a Mermaid specification:

use WWW::MermaidInk;

'graph TD 
   WL --> |ZMQ|Python --> |ZMQ|WL' 
 ==> mermaid-ink(format=>'md-image')

Create a Markdown image expression from a class diagram:

my $spec = q:to/END/;
classDiagram
    Animal <|-- Duck
    Animal <|-- Fish
    Animal <|-- Zebra
    Animal : +int age
    Animal : +String gender
    Animal: +isMammal()
    Animal: +mate()
    class Duck{
        +String beakColor
        +swim()
        +quack()
    }
    class Fish{
        -int sizeInFeet
        -canEat()
    }
    class Zebra{
        +bool is_wild
        +run()
    }
END

mermaid-ink($spec, format=>'md-image')    

Scope

The first argument can be a list of pairs — the corresponding Mermaid-js graph is produced. Here are the edges of a directed graph:

my @edges = ['1' => '3', '3' => '1', '1' => '4', '2' => '3', '2' => '4', '3' => '4'];

# [1 => 3 3 => 1 1 => 4 2 => 3 2 => 4 3 => 4]

Here is the corresponding mermaid-js image:

mermaid-ink(@edges, format=>'md-image')


Command Line Interface (CLI)

The package provides the CLI script mermaid-ink. Here is its help message:

mermaid-ink --help

# Usage:
#   mermaid-ink <spec> [-o|--file=<Str>] [--format=<Str>] -- Diagram image for Mermaid-JS spec (via mermaid.ink).
#   mermaid-ink [<words> ...] [-o|--file=<Str>] [--format=<Str>] -- Command given as a sequence of words.
#   
#     <spec>             Mermaid-JS spec.
#     -o|--file=<Str>    File to export the image to. [default: '']
#     --format=<Str>     Format of the result; one of "asis", "base64", "md-image", or "none". [default: 'md-image']
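
For example, the following call renders a spec and exports it to a PNG file:

mermaid-ink 'graph TD; A --> B' -o=out.png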


Flowchart

This flowchart summarizes the execution path of obtaining Mermaid images in a Markdown document:


References

Articles

[AA1] Anton Antonov, “Interactive Mermaid diagrams generation via Markdown evaluation”, (2022), RakuForPrediction at WordPress.

Functions and packages

[AAf1] Anton Antonov, MermaidInk Mathematica resource function, (2022-2023), Wolfram Function Repository.

Mermaid resources

Racoons playing with pearls and onions

This blog post consists of the slides of the presentation “Racoons playing with pearls and onions”:


Image generation with OpenAI

… using the Raku module “WWW::OpenAI”

Anton Antonov

 RakuForPrediction-book at GitHub 

March 2023


Presentation plan

Warm-up with text generation

  • ChatGPT

Image generations

  • URLs
  • Base64s

Using Comma IDE and CLI

Other examples


Setup

Load the package

use WWW::OpenAI;

(*"(Any)"*)

Get the authorization key

RakuInputExecute["my $auth-key='" <> AUTHKEY <> "'"];


Warm up (completions)

my @txtRes = |openai-completion(
	'which is the most successful programming language', 
	n=>3, 
	temperature=>1.3, 
	max-tokens=>120, 
	format=>'values',	
	:$auth-key);
	
@txtRes.elems

(*"3"*)

RakuInputExecute["@txtRes==>encode-to-wl"] // ResourceFunction["GridTableForm"]


Image generation (URLs)

my @imgRes = |openai-create-image(
	'Racoon with pearls in the style Raphael', 
	 n=>3,  
	 response-format=>'url',
	 format=>'values',
	 :$auth-key);
	
@imgRes.elems;

(*"3"*)

Magnify[Import[#], 2.5] & /@ RakuInputExecute["@imgRes==>encode-to-wl"]

@imgRes


Image generation (Base64s)

my @imgRes2 = |openai-create-image(
	'Racoon and sliced onion in the style Rene Magritte',  
	n=>3,
	response-format=>'b64_json', 
	format=>'values',	
	:$auth-key);
	
@imgRes2.elems;

(*"3"*)

Magnify[ImportByteArray[BaseDecode[#]], 2.5] & /@ RakuInputExecute["@imgRes2==>encode-to-wl"]

@imgRes2

WebImageSearch["Son of Man Magritte", 3]


Using Comma IDE and CLI

“Text::CodeProcessing” used in Mathematica and executable Markdown


Some more

Monet

my @imgRes3 = |openai-create-image('Racoons playing onions and perls in the style Hannah Wilke', n=>3, size => 'medium', format=>'values', response-format=>'b64_json', :$auth-key);
@imgRes3.elems;

(*"3"*)

Magnify[ImportByteArray[BaseDecode[#]], 1.5] & /@ RakuInputExecute["@imgRes3==>encode-to-wl"]

openai-create-image('Racoons playing onions and perls in the style Monet', n=>3, size => 'largers')

(*"#ERROR: The argument $size is expected to be Whatever or one of '1024x1024, 256x256, 512x512, large, medium, small'.Nil"*)

Helmut Newton

my @imgRes3 = |openai-create-image('Racoon in the style Helmut Newton', n=>3, size => 'small', format=>'values', response-format=>'b64_json', :$auth-key);
@imgRes3.elems;

(*"3"*)

Magnify[ImportByteArray[BaseDecode[#]], 1.5] & /@ RakuInputExecute["@imgRes3==>encode-to-wl"]


Hieronymus Bosch

my @imgRes3 = |openai-create-image('how we live now in the style of Hieronymus Bosch', n=>3, size => 'small', format=>'values', response-format=>'b64_json', :$auth-key);
@imgRes3.elems;

(*"3"*)

Magnify[ImportByteArray[BaseDecode[#]], 1.5] & /@ RakuInputExecute["@imgRes3==>encode-to-wl"]


Inkblots

my @imgRes4 = |openai-create-image('Racoon in the style Roschach inkblot', :$auth-key,  n=>6,  format=>'values');
@imgRes4.elems;

(*"6"*)

Magnify[Import[#], 1.5] & /@ RakuInputExecute["@imgRes4==>encode-to-wl"]


Saved


Alternative

Table[ResourceFunction["RandomRorschach"]["NumberOfStrokes" -> RandomChoice[{8, 12, 20}, RandomInteger[{1, 4}]], ColorFunction -> (GrayLevel[RandomReal[{0.1, 0.2}]] &), "ImageEffects" -> {{"Jitter", 16}, {"OilPainting", 6}}], 12]


Goldberg machines

my @imgRes5 = |openai-create-image('Camels in a Rube Goldberg machine', :$auth-key,  n=>6,  format=>'values');
@imgRes5.elems;

(*"6"*)

Magnify[Import[#], 1.5] & /@ RakuInputExecute["@imgRes5==>encode-to-wl"]


Literate programming via CLI

In this blog post we demonstrate how to do “Literate programming” in Raku via (short) pipelines of Command Line Interface (CLI) scripts.

An alternative to this CLI-based process is to use Mathematica or Jupyter notebooks.

Presentation’s GitHub directory

Video recording


Steps and components

Here is a narration of the diagram above:

  1. Write some text and code in a Markdown file
  2. Weave the Markdown file (i.e. “run it”)
  3. If the woven file does not have D3.js graphics, go to step 5
  4. Convert the woven Markdown file into HTML
  5. Examine results
  6. Go to 1
    • Or finish “fiddling with it”

The conversions

Then we use the document “Cryptocurrencies-explorations.md” (with Raku code over cryptocurrencies data).

If no D3.js graphics are specified then we can use the shell command:

file-code-chunks-eval Cryptocurrencies-explorations.md

If D3.js graphics are specified then we can use the shell command:

file-code-chunks-eval Cryptocurrencies-explorations.md && 
  from-markdown Cryptocurrencies-explorations_woven.md -t html -o out.html && 
  open out.html

Remark: It is instructive to examine “Cryptocurrencies-explorations_woven.md” and compare it with “Cryptocurrencies-explorations.md”.

Remark: The code chunks with graphics (using “JavaScript::D3”) have to have the chunk option setting results=asis.

Remark: The “JavaScript::D3” commands have to have the option settings format => 'html' and div-id => ....


References

Articles

[AA1] Anton Antonov “Raku Text::CodeProcessing”, (2021), RakuForPrediction at WordPress.

[AA2] Anton Antonov “JavaScript::D3”, (2022), RakuForPrediction at WordPress.

[AA3] Anton Antonov “Further work on the Raku-D3.js translation”, (2022), RakuForPrediction at WordPress.

Packages

[AAp1] Anton Antonov, Data::Cryptocurrencies Raku package, (2023). GitHub/antononcube.

[AAp2] Anton Antonov, JavaScript::D3 Raku package, (2022). GitHub/antononcube.

[AAp3] Anton Antonov, Text::CodeProcessing Raku package, (2021). GitHub/antononcube.

[AAp4] Anton Antonov, Markdown::Grammar, (2022). GitHub/antononcube.

Gherkin::Grammar

This blog post introduces and briefly describes the Raku package “Gherkin::Grammar” for parsing and interpretation of Gherkin test specifications.

Gherkin is the language of the Cucumber framework, [Wk1], that is used to do Behavior-Driven Development (BDD), [Wk2].

The Raku package “Cucumis Sextus”, [RLp1], aims to provide a “full-blown” specification-and-execution framework in Raku, like the typical Cucumber functionalities in other languages (Ruby, Java, etc.).

This package, “Gherkin::Grammar”, takes a minimalist perspective; it aims to provide:

  • Grammar (and roles) for parsing Gherkin specifications
  • Test file template generation

Having a “standalone” Gherkin grammar (or role) facilitates the creation and execution of general or specialized frameworks for Raku support of BDD.

The package provides the functions:

  • gherkin-parse
  • gherkin-subparse
  • gherkin-interpret

The Raku outputs of gherkin-interpret are test file templates that, after filling in, would provide tests that correspond to the input specifications.

Remark: A good introduction to the Cucumber / Gherkin approach and workflows is the README of [RLp1].

Remark: The grammar in this package was programmed following the specifications and explanations in Gherkin Reference.


Installation

From Zef ecosystem:

zef install Gherkin::Grammar

From GitHub:

zef install https://github.com/antononcube/Raku-Gherkin-Grammar


Workflow

The package follows the general Cucumber workflow, but some elements are less automated. Here is a flowchart:

Here is corresponding narration:

  1. Write tests using Gherkin specs
  2. Generate test code
    • Using the package “Gherkin::Grammar” (see the CLI example below).
  3. Fill-in the code of step functions
  4. Execute tests
  5. Revisit (refine) steps 1 and/or 4 as needed
  6. Integrate resulting test file

Remark: See the Cucumber framework flowchart in the file Flowcharts.md.
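
For example, step 2 can be done with the CLI script described in the section “CLI” below:

gherkin-interpretation Calculator.feature -o=Calculator.rakutest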


Usage examples

Here is a basic (and short) Gherkin spec interpretation example:

use Gherkin::Grammar;

my $text0 = q:to/END/;
Feature: Calculation
    Example: One plus one
    When 1 + 1
    Then 2
END

gherkin-interpret($text0);

# use v6.d;
# 
# #============================================================
# 
# proto sub Background($descr) {*}
# proto sub ScenarioOutline(@cmdFuncPairs) {*}
# proto sub Example($descr) {*}
# proto sub Given(Str:D $cmd, |) {*}
# proto sub When(Str:D $cmd, |) {*}
# proto sub Then(Str:D $cmd, |) {*}
# 
# #============================================================
# 
# use Test;
# plan *;
# 
# #============================================================
# # Example : One plus one
# #------------------------------------------------------------
# 
# multi sub When( $cmd where * eq '1 + 1' ) {}
# 
# multi sub Then( $cmd where * eq '2' ) {}
# 
# multi sub Example('One plus one') {
# 	When( '1 + 1' );
# 	Then( '2' );
# }
# 
# is Example('One plus one'), True, 'One plus one';
# 
# done-testing;
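
The lower-level functions can be used in a similar way. Here is a minimal sketch, assuming gherkin-parse simply applies the grammar to the spec string and returns the corresponding Match object:

my $match = gherkin-parse($text0);
say so $match;

# True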

Internationalization

The package provides internationalization using different languages. The (initial) internationalization keyword-regexes data structure was taken from [RLp1]. (See the file “I18n.rakumod”.)

Here is an example with Russian:

my $ru-text = q:to/END/;
Функционал: Вычисление
    Пример: одно плюс одно
    Когда 1 + 1
    Тогда 2
END

gherkin-interpret($ru-text, lang => 'Russian');

# use v6.d;
# 
# #============================================================
# 
# proto sub Background($descr) {*}
# proto sub ScenarioOutline(@cmdFuncPairs) {*}
# proto sub Example($descr) {*}
# proto sub Given(Str:D $cmd, |) {*}
# proto sub When(Str:D $cmd, |) {*}
# proto sub Then(Str:D $cmd, |) {*}
# 
# #============================================================
# 
# use Test;
# plan *;
# 
# #============================================================
# # Example : одно плюс одно
# #------------------------------------------------------------
# 
# multi sub When( $cmd where * eq '1 + 1' ) {}
# 
# multi sub Then( $cmd where * eq '2' ) {}
# 
# multi sub Example('одно плюс одно') {
# 	When( '1 + 1' );
# 	Then( '2' );
# }
# 
# is Example('одно плюс одно'), True, 'одно плюс одно';
# 
# done-testing;

Doc-string Arguments

The package takes both doc-strings and tables as step arguments.

Doc-strings are put between lines with triple quotes; the text between the quotes is given as the second argument of the corresponding step function.

Here is an example of a Gherkin specification for testing a data wrangling Domain Specific Language (DSL) parser-interpreter, [AA1, AAp2], that uses doc-string:

Feature: Data wrangling DSL pipeline testing

  Scenario: Long pipeline
    Given target is Raku
    And titanic dataset exists
    When is executed the pipeline:
      """
      use @dsTitanic;
      filter by passengerSurvival is "survived";
      cross tabulate passengerSex vs passengerClass
      """
    Then result is a hash

That specification is part of the Gherkin file: “DSL-for-data-wrangling.feature”.

The corresponding code generated by “Gherkin::Grammar” is given in the file: “DSL-for-data-wrangling-generated.rakutest”.

The fill-in definitions of the corresponding functions are given in the file: “DSL-for-data-wrangling.rakutest”.
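
For illustration, here is a minimal, hypothetical sketch of filling in the doc-string-taking When step above (the two-argument form follows from the doc-string passing described earlier; the body is illustrative only):

multi sub When( $cmd where * eq 'is executed the pipeline:', Str:D $doc-string ) {
    # Illustrative body: the DSL pipeline arrives as the second, doc-string argument
    say $doc-string.lines.elems;
}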

Table arguments

The package handles tables as step arguments. The table arguments are treated differently in Example or Scenario blocks than in Scenario outline blocks.

Here is a “simple” use of a table:

Feature: DateTime parsing tests

  Scenario: Simple
    When today, yesterday, tomorrow
    Then the results adhere to:
      | Spec      | Result                        |
      | today     | DateTime.today                |
      | yesterday | DateTime.today.earlier(:1day) |
      | tomorrow  | DateTime.today.later(:1day)   |

Here is a Scenario Outline spec:

Feature: DateTime parsing tests 2

   Scenario Outline: Repeated
      Given <Spec>
      Then <Result>
      Examples: the results adhere to:
         | Spec      | Result                        |
         | today     | DateTime.today                |
         | yesterday | DateTime.today.earlier(:1day) |
         | tomorrow  | DateTime.today.later(:1day)   |

Remark: The package “Markdown::Grammar”, [AAp1], parses tables in a similar manner, but [AAp1] assumes that a table field can have plain words, words with slant or weight, or hyperlinks.

Remark: The package [AAp1] parses tables with and without headers. The Gherkin language descriptions and examples I have seen did not have tables with header separators. Hence, a header separator is treated as a regular table row in “Gherkin::Grammar”.


Complete examples

Calculator

The files “Calculator.feature” and “Calculator.rakutest” provide a simple, fully worked example of how this package can be used to implement Cucumber framework workflows.

Remark: The Cucumber framework(s) expect Gherkin test specifications to be written in files with extension “.feature”.

DateTime interpretation

The date-time interpretations of the package “DateTime::Grammar”, [AAp3], are tested with the feature file “DateTime-interpretation.feature” (and the related “*.rakutest” files.)

Numeric word forms parsing

The interpretation of numeric word forms into numbers by the package “Lingua::NumericWordForms”, [AAp4], is tested with the feature file “Numeric-word-forms-parsing.feature” (and the related “*.rakutest” files.)

DSL for data wrangling

The data wrangling translations and execution results of the package “DSL::English::DataQueryWorkflows”, [AA1, AAp2], are tested with the feature file “DSL-for-data-wrangling.feature” (and the related “*.rakutest” files.)

This is a fairly non-trivial example that involves multiple packages. Also, it makes a lot of sense to test DSL translators using a testing DSL (like Gherkin).


CLI

The package provides a Command Line Interface (CLI) script. Here is its help message:

gherkin-interpretation --help

# Usage:
#   gherkin-interpretation <fileName> [-l|--from-lang=<Str>] [-t|--to-lang=<Str>] [-o|--output=<Str>] -- Interprets Gherkin specifications.
#   
#     -l|--from-lang=<Str>    Natural language in which the feature specification is written in. [default: 'English']
#     -t|--to-lang=<Str>      Language to interpret (translate) the specification to. [default: 'Raku']
#     -o|--output=<Str>       File to place the interpretation to. (If '-' stdout is used.) [default: '-']


References

Articles

[AA1] Anton Antonov, “Introduction to data wrangling with Raku”, (2021), RakuForPrediction at WordPress.

[SB1] SmartBear, “Gherkin Reference”, (2023), cucumber.io.

[Wk1] Wikipedia entry, “Cucumber (software)”. See also cucumber.io.

[Wk2] Wikipedia entry, “Behavior-driven development”.

Packages

[AAp1] Anton Antonov, Markdown::Grammar Raku package, (2022-2023), GitHub/antononcube.

[AAp2] Anton Antonov, DSL::English::DataQueryWorkflows Raku package, (2021-2023), GitHub/antononcube.

[AAp3] Anton Antonov, DateTime::Grammar Raku package, (2023), GitHub/antononcube.

[AAp4] Anton Antonov, Lingua::NumericWordForms Raku package, (2021-2023), GitHub/antononcube.

[RLp1] Robert Lemmen, Cucumis Sextus Raku package, (2017-2020), GitHub/robertlemmen.

Using Tries for Markov chain text generation

Introduction

In this document we discuss the derivation and utilization of Markov chains for random text generation. The Markov chains are computed, represented, and utilized with the data structure Tries with frequencies, [AA2, AAp1, AAv1]. (Tries are also known as “prefix trees.”)

We can say that for a given text we use Tries with frequencies to derive language models, and we generate new, plausible text using those language models.

In a previous article, [AA1], I discussed how text generation with Markov chains can be implemented with Wolfram Language (WL) sparse arrays.

Remark: Tries with frequencies can be also implemented with WL’s Tree structure. The implementation in [AAp5] — which corresponds to [AAp1] — relies only on Association, though. Further, during the Wolfram Technology Conference 2022 I think I successfully convinced Ian Ford to implement Tries with frequencies using WL’s Tree. (Ideally, at some point his implementation would be more scalable and faster than my WL implementation.)

Remark: We can say that this document provides examples of making language models that are (much) simpler than ChatGPT’s models, as mentioned by Stephen Wolfram in [SWv1]. With this kind of model we can easily generate sentences like “The house ate the lettuce.” or “The chair was happy.”, [SWv1].

Remark: The package “Text::Markov”, [PP1], by Paweł Pabian also generates text via Markov chains by using a specialized, “targeted” implementation. This document presents a way of doing Markov chains simulations using the general data structure Tries with frequencies.

Remark: Language models similar to the ones built below can be used to do phrase completion and contextual spell checking. The package “ML::Spellchecker”, [AAp4], uses that approach.

Remark: This (computational) Markdown document corresponds to the Mathematica notebook “Using Prefix trees for Markov chain text generation”.


Load packages

Here we load the packages “ML::TriesWithFrequencies”, [AAp1], “Text::Plot”, [AAp2], and “Data::Reshapers”, [AAp3]:

use ML::TriesWithFrequencies;
use Text::Plot;
use Data::Reshapers;

# (Any)


Ingest text

Download the “OriginOfSpecies.txt.zip” file and unzip it.

Get the text from that file:

my $text = slurp($*HOME ~ '/Downloads/OriginOfSpecies.txt');
$text.chars

# 893121

Turn the text into separate words and punctuation characters:

my $tstart = now;

my @words = $text.match(:g, / \w+ | <punct> /).map({ ~$_ })>>.lc;

my $tend = now;
say "Time to split into words: { $tend - $tstart }.";

# Time to split into words: 1.074823841.

Some statistics:

say "Number of words: { @words.elems }.";
say "Number of unique words: { @words.Bag.elems }.";

# Number of words: 170843.
# Number of unique words: 6927.

Here is a sample:

@words.head(32).raku;

# ("introduction", ".", "when", "on", "board", "h", ".", "m", ".", "s", ".", "'", "beagle", ",", "'", "as", "naturalist", ",", "i", "was", "much", "struck", "with", "certain", "facts", "in", "the", "distribution", "of", "the", "inhabitants", "of").Seq

Here is the plot that shows Pareto principle adherence to the word counts:

text-pareto-principle-plot( @words.Bag.values.List, title => 'Word counts');

# Word counts                         
#     0.00           0.29          0.58           0.87        
# +---+--------------+-------------+--------------+----------+      
# |                                                          |      
# +               ****************************************   +  1.00
# |        ********                                          |      
# +     ****                                                 +  0.80
# |    **                                                    |      
# +   **                                                     +  0.60
# |   *                                                      |      
# |   *                                                      |      
# +   *                                                      +  0.40
# |   *                                                      |      
# +   *                                                      +  0.20
# |   *                                                      |      
# +                                                          +  0.00
# +---+--------------+-------------+--------------+----------+      
#     0.00           2000.00       4000.00        6000.00

Remark: We see typical Pareto principle manifestation: ≈10% of the unique words correspond to ≈80% of all words in the text.


Markov chain construction and utilization

Derive 2-grams:

my $n = 2;
my @nGrams = @words.rotor($n => -1);
@nGrams.elems

# 170842
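
The negative step of rotor makes consecutive n-grams overlap by one word:

say <a b c d e>.rotor(2 => -1);

# ((a b) (b c) (c d) (d e))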

Here we create a trie using 2-grams:

$tstart = now;

my $trWords = trie-create(@nGrams);

$tend = now;
say "Time to create the n-gram trie: { $tend - $tstart }.";

# Time to create the n-gram trie: 10.30988243.

Example of random n-gram generation:

.say for $trWords.random-choice(6):drop-root;

# (than do)
# (extremely difficult)
# (these considerations)
# (a great)
# (any one)
# (less sterile)

Markov chain generation:

my @generated = ['every', 'author'];

for ^30 -> $i {
    @generated = [|@generated.head(*- 1), |$trWords.retrieve([@generated.tail,]).random-choice:drop-root]
}

say @generated;

# [every author attempts to be strange contingencies are far more strictly littoral animals , owing to me to appear in comparison with hermaphrodites having had undergone a large amount of the great]

Here we modify the code line above into a sub that:

  1. Generates and splices random n-grams until $maxSentences sentences are obtained
  2. Makes the generated text more “presentable”

sub text-generation($trWords, @startPhrase, UInt $maxSentences = 10) {
    my @generated = @startPhrase>>.lc;
    my $nSentences = 0;
    loop {
        my @ph = $trWords.retrieve([@generated.tail,]).random-choice:drop-root;
        if @ph.tail ∈ <. ? !> { $nSentences++ };
        @generated = [|@generated.head(*- 1), |@ph];
        if $nSentences == $maxSentences { last; }
    }

    @generated[0] = @generated[0].tc;

    my $res = @generated.join(' ').subst(/ \s (<punct>) /, { ~$0 }):g;
    my $res2 = $res.subst(/ ('.' | '!' | '?') ' ' (\w) /, {  $0 ~ ' ' ~ $1.uc }):g;
    return $res2;
}

# &text-generation

Remark: We consider a complete sentence to finish with a period, question mark, or exclamation mark (<. ! ?>). Hence, the stopping criterion “counts” how many times those punctuation marks have been reached with the generated n-grams.

Generate sentences:

text-generation($trWords, <Consider the>, 3)

# Consider the cause other plants, the parent- when a short digression. Hence we have expected on the glacial period, owing to diverge in the destruction of the least possible. Yet suffered in the fact, of the use and this case of precious nectar out of their conditions, marvellous amount of this affinity, and permanent varieties have surmised that the several domestic breeds have descended from one hand, state of which rocky mountains of the victor to judge whether of affinity of the different varieties, so that i watched the bees alone had a single parent successful in huge fragments fall of the bees are either parent-- six or emeu, and has once begun to the nicotiana glutinosa, orders, both are not only with their migration of the arctic shells might be given of the secretion of the reptiles and for transport, the stock; for 30 days, a yellow seeds by every continent when slight successive generation; and it can easily err in our continents now find different varieties of equatorial south america to fill nearly related in successive modification is often separated than is conceivable extent that the conclusion.


Using larger n-grams

Let us repeat the text generation above with larger n-grams (e.g. 4-grams).

Here we generate the n-gram trie:

my $n = 4;
my @nGrams = @words.rotor($n => -1);
my $trWords = trie-create(@nGrams);
$trWords.node-counts

# {Internal => 76219, Leaves => 53251, Total => 129470}

Generate text and show it:

my $genText = text-generation($trWords, <Every author>, 6);
.trim.say for $genText.split(/ < . ! ? > /, :v).rotor(2)>>.join;

# Every author, from experiments made during different strains or sub- species, by the effects.
# Homologous parts of the same species should present varieties; so slightly, and are the parents( a) differed largely from the same part of the organisation, which has given several remarkable case of difficulty in a single pair, or africa under nearly constant.
# And from their common and swedish turnip.
# The resemblance in its crustacea, for here, as before the glacial epoch, and which is generally very few will transmit its unaltered likeness to a distant and isolated regions could be arranged by dr.
# Two distinct organs, such as many more individuals of the same bones in the different islands.
# The most common snapdragon( antirrhinum) a rudiment of a pistil; and kolreuter found that by alph.
# De beaumont, murchison, barrande, and have not be increased; and as modern, by the chest; and it was evident in natural history, when the same species.
# Watson, to any character derived from a single bone of a limb and branch and sub- marked and well fitted for them are varieties or three alone of that individual flower: for instance in south america, has run( more especially as all very long periods, that in allied groups, in order to show that an early age, or dependent on unknown element of a vast number of bears being rendered rudimentary occasionally differs from its embryo, but afterwards breeds from his selection by some way due to be grafted together?
# Why should the sepals, we can dimly seen or quite fertile together, where it exists, they must be assumed that, after a or of i think, see the contest soon decided: for here neither cattle, sheep, and would give two or three of these ants, by the best and safest clue.
# I must believe, than to the other silurian molluscs, even in the many cases will suffice.
# We possess no one will doubt that the continued habit.
# On naked submarine rocks and making fresh water change at least be asserted that all the groups within each area are related to those of the first union of the male sexual element of facts, otherwise would have done for sheep, etc.
# , a large stock from our european seas.
# In explaining the laws of variation have proceeded from one becomes rarer and english pointer have been ample time, become modified forms, increase in countless numbers would quickly become wholly extinct, prepare the transition of any kind must have occurred to me, there has been variable in the long endurance of the other, always induces weakness and sterility in ordinary combs it will be under unchanged conditions, at least, to be connected by the same species when self- fertilising hermaphrodites do occasionally intercross( if such fossiliferous masses can we suppose that natural selection, and a third, a, tenanted by no doubt had some former period under a physiological or by having adapted to the most frequently insisted on increasing in size of the genera and their primordial parent, and england; but a few ferns and a few centuries, through natural selection; and two seeds.
# Reflect for existence amongst all appear together from experiments made in the majority of their wood, the most vigorous species; by slowly acting and in this case it will have associated with several allied species.
# M.
# Is some variation.

Remark: Using longer n-grams generally produces longer sentences since the probability to “reach” the period punctuation character with longer n-grams becomes smaller.


Language model adherence verification

Let us convince ourselves that the method random-choice produces words with a distribution that corresponds to the trie it is invoked upon.

Here we make a trie for a certain set of words:

my @words = ['bar' xx 6].append('bark' xx 3).append('bare' xx 2).append('cam' xx 3).append('came').append('camelia' xx 4);
my $ptrOriginal = trie-create-by-split(@words).node-probabilities;
$ptrOriginal.node-counts

# {Internal => 10, Leaves => 3, Total => 13}

Here we generate random words using the trie above and make a new trie with them:

my $ptrGenerated = trie-create($ptrOriginal.random-choice(120):drop-root).node-probabilities;
$ptrGenerated.node-counts

# {Internal => 10, Leaves => 3, Total => 13}

Here is a comparison table between the two tries:

my %tries = Original => $ptrOriginal, Generated => $ptrGenerated;
say to-pretty-table([%tries>>.form,], field-names => <Original Generated>, align => 'l');

# +----------------------------------+----------------------------------+
# | Original                         | Generated                        |
# +----------------------------------+----------------------------------+
# | TRIEROOT => 1                    | TRIEROOT => 1                    |
# | ├─b => 0.5789473684210527        | ├─b => 0.55                      |
# | │ └─a => 1                       | │ └─a => 1                       |
# | │   └─r => 1                     | │   └─r => 1                     |
# | │     ├─e => 0.18181818181818182 | │     ├─e => 0.15151515151515152 |
# | │     └─k => 0.2727272727272727  | │     └─k => 0.2727272727272727  |
# | └─c => 0.42105263157894735       | └─c => 0.45                      |
# |   └─a => 1                       |   └─a => 1                       |
# |     └─m => 1                     |     └─m => 1                     |
# |       └─e => 0.625               |       └─e => 0.6296296296296297  |
# |         └─l => 0.8               |         └─l => 1                 |
# |           └─i => 1               |           └─i => 1               |
# |             └─a => 1             |             └─a => 1             |
# +----------------------------------+----------------------------------+

We see that the probabilities at the nodes are very similar. It is expected that, with a larger number of generated words, nearly the same probabilities would be obtained.


Possible extensions

Possible extensions include the following:

  • Finding Part Of Speech (POS) labels for each word and making “generalized” sequences of POS labels.
    • Those kinds of POS-based language models can be combined with the “regular”, word-based ones in a variety of ways.
    • One such way is to use a POS-based model as a censor of a word-based model.
    • Another is to use a POS-based model to generate POS sequences, and then “fill-in” those sequences with actual words.
  • N-gram-based predictions can be used to do phrase completions in (specialized) search engines.
    • That can be especially useful if the phrases belong to a certain Domain Specific Language (DSL). (And there is large enough collection of search queries with that DSL.)
  • Instead of words any sequential data can be used.
    • See [AAv1] for an application to predicting driving trips destinations.
    • Certain business simulation models can be done with Trie-based sequential models.

References

Articles

[AA1] Anton Antonov, “Markov chains n-gram model implementation”, (2014), MathematicaForPrediction at WordPress.

[AA2] Anton Antonov, “Tries with frequencies for data mining”, (2013), MathematicaForPrediction at WordPress.

Packages

[AAp1] Anton Antonov, ML::TriesWithFrequencies Raku package, (2021-2023), raku.land/zef:antononcube.

[AAp2] Anton Antonov, Text::Plot Raku package, (2022), raku.land/zef:antononcube.

[AAp3] Anton Antonov, Data::Reshapers Raku package, (2021-2023), raku.land/zef:antononcube.

[AAp4] Anton Antonov, “ML::Spellchecker”, (2023), GitHub/antononcube.

[AAp5] Anton Antonov, TriesWithFrequencies Mathematica package, (2018), MathematicaForPrediction at GitHub.

[PP1] Paweł Pabian, Text::Markov Raku package, (2016-2023), raku.land/zef:bbkr.

Videos

[AAv1] Anton Antonov, “Prefix Trees with Frequencies for Data Analysis and Machine Learning”, (2017), Wolfram Technology Conference 2017.

[SWv1] Stephen Wolfram, “Stephen Wolfram Answers Live Questions About ChatGPT”, (2023), YouTube/Wolfram.

Further work on the Raku-D3.js translation

Introduction

This document gives motivation and design details for the Raku package “JavaScript::D3” and discusses the usefulness of implementing:

  • The package itself
  • The corresponding Python package “JavaScriptD3”, [AAp3]
  • Functions for drawing random mandalas and scribbles

Some familiarity with “JavaScript::D3” is assumed. See the article “JavaScript::D3”, or the video “The Raku-ju hijack hack of D3.js”, [AAv1].


Why did I implement this package?

Using D3.js via Raku is a good answer to the question “How to do data-driven visualizations with Raku?”

I used (or tried to use) the alternatives outlined in the following sub-sections.

SVG::Plot

The package “SVG::Plot” is a good illustration that a Raku visualization system can be built “from scratch”, but it is way too much work to develop “SVG::Plot” to have all the features needed for informative plots. (Various plot labels, grid lines, etc.)

One of the best features of “SVG::Plot” is that it is supported “out of the box” by the Jupyter Raku kernel [BD1].

Chart::Gnuplot

I have attempted to install the package “Chart::Gnuplot” a few times on my computer without success. Some people from the Raku community claim it is useful.

Text::Plot

Earlier this year I wrote the package “Text::Plot”, because I needed some plotting for my presentation at TRC-2022.

Its main advantage is that “it can be used anywhere.” Occasionally, I find it very useful, but it is just a crutch, not a real plotting solution.

For example, it is hard to have informative time series plotted with “Text::Plot”. In support of that statement see this time series plot done with Mathematica for the number of IRC Raku channel posts per week in the last three years (by certain known posters):

That is why!

When I do data analysis I want to be able to:

  1. Make nice and informative plots
  2. Use concise plot specifications
  3. Quickly do the required setup
  4. Claim easy reproducibility and portability of the related documents

Using D3.js definitely provides 1. The package “JavaScript::D3” aims to provide 2. Setting up Jupyter (notebooks) with the Raku kernel implemented by Brian Duggan, [BD1], is not that hard. (But be warned that YMMV.) The Jupyter notebooks are widely shareable at this point, so, 4 is also satisfied.

Here is an example bubble plot made in Jupyter:


Why did I not implement this a year ago?

I do not like Jupyter much, and I mostly stay away from it. That is why, 1.5 years ago, when I read the code of “Jupyter::Kernel”, I did not try to comprehend the magics. (I was interested in sand-boxing Raku.)

Brian Duggan has put up examples of how the magics can be used to embed HTML and SVG code in Jupyter, [BD1]. Those prompted me to utilize that framework to use JavaScript.


Refactoring via re-implementation in Python

Re-implementing the package (from Raku) to Python was a great way to brainstorm the necessary refactoring of the JavaScript code snippets and related package architecture. See the examples in [AAp3].


Multi-dataset visualizations

The initial version of “JavaScript::D3” did not have multi-dataset support for bar charts and histograms. I figured out how to do that for bar charts; still working on histograms.

Here is a bar chart example:



Random mandalas and scribbles

Another way to “challenge” the package design and implementation is to (try to) implement functions that draw random mandalas, [AAv2], and random scribbles.

Implementing the corresponding functions required to:

  1. Have optional axes specification
  2. Be able to produce D3.js code without placement wrappers (like an HTML declaration, or JavaScript brackets for Jupyter)

The refactoring for 2. was required in order to have multiple plots in one Jupyter cell. Also, it was fairly involved, although it is “just” gluing of the main, core plotting parts.

I wanted to do as much as possible on the Raku side:

  • The random points generation and “arrangement” is done in Raku
  • The plotting is done with D3.js

Remark: The algorithms for drawing, say, Bezier curves through the random points are far from trivial and D3.js does this very nicely.

Here is an example of random mandala generation:

Here is an example of random scribbles generation:

For interactive examples, see the video “Random mandalas generation (with D3.js via Raku)”.


Some “leftover” remarks

  • My Jupyter setup is “local” on my laptop. I have tried to use Raku in cloud setups a few times without much success.
  • I think there are some issues with the ZMQ connections between Raku’s “Jupyter::Kernel” and “the big” Jupyter framework. Observing those difficulties and making reproducible bug reports is not easy.
  • If more people use Raku with Jupyter that would help or induce reliability improvements.
  • (I hope my demos with “JavaScript::D3” would make some Rakunistas interrupt their VI addiction and try the Raku-Jupyter synergy.)

References

Articles

[AA1] Anton Antonov, “JavaScript::D3”, (2022), RakuForPrediction at WordPress.

[OV1] Olivia Vane, “D3 JavaScript visualisation in a Python Jupyter notebook”, (2020), livingwithmachines.ac.uk.

[SF1] Stefaan Lippens, Custom D3.js Visualization in a Jupyter Notebook, (2018), stefaanlippens.net.

Packages

[AAp1] Anton Antonov, Data::Reshapers Raku package, (2021-2022), GitHub/antononcube.

[AAp2] Anton Antonov, Text::Plot Raku package, (2022), GitHub/antononcube.

[AAp3] Anton Antonov, JavaScriptD3 Python package, (2022), Python-packages at GitHub/antononcube.

[BD1] Brian Duggan, Jupyter::Kernel Raku package, (2017-2022), GitHub/bduggan.

[MLp1] Moritz Lenz, SVG::Plot Raku package (2009-2018), GitHub/moritz.

Videos

[AAv1] Anton Antonov, “The Raku-ju hijack hack of D3.js”, (2022), Anton Antonov’s channel at YouTube.

[AAv2] Anton Antonov, “Random mandalas generation (with D3.js via Raku)”, (2022), Anton Antonov’s channel at YouTube.

Interactive Mermaid diagrams generation via Markdown evaluation

Introduction

In this document (and related presentation) we discuss the interactive making of Mermaid-JS diagrams via evaluation of code cells in Markdown documents.

The “interactive” changes are possible because of the following package updates:

Further, the “interactivity” relies on the automatic re-rendering of the used Integrated Development Environments (IDEs), like IntelliJ IDEA, Comma IDE, or Visual Studio Code.

Remark: In the preparation of this document and the presentation, we use the Command Line Interface (CLI) script file-code-chunks-eval provided by “Text::CodeProcessing”.

Remark: “Text::CodeProcessing” also provides the script cronify that facilitates periodic execution of a shell command (with parameters.) It heavily borrows ideas and code from the chapter “Silent Cron, a Cron Wrapper” of the book, “Raku Fundamentals” by Moritz Lenz, [ML1].

Remark: After some experimentation the script cronify was not found to be that useful for the “interactive” effect.


Presentation plan

Here is a flowchart of the presentation:


UML diagram

Here we load the package “UML::Translators”, [AAp2], and derive the Mermaid-JS spec for “ML::Clustering”:

use UML::Translators;
to-uml-spec('ML::Clustering', format => 'mermaid')

# classDiagram
# class ML_Clustering_KMeans {
#   +BUILDALL()
#   +args-check()
#   +bray-curtis-distance()
#   +canberra-distance()
#   +chessboard-distance()
#   +cosine-distance()
#   +distance()
#   +euclidean-distance()
#   +find-clusters()
#   +hamming-distance()
#   +manhattan-distance()
#   +norm()
#   +squared-euclidean-distance()
# }
# ML_Clustering_KMeans --|> ML_Clustering_DistanceFunctions
# 
# 
# class ML_Clustering_DistanceFunctions {
#   <<role>>
#   +args-check()
#   +bray-curtis-distance()
#   +canberra-distance()
#   +chessboard-distance()
#   +cosine-distance()
#   +distance()
#   +euclidean-distance()
#   +manhattan-distance()
#   +squared-euclidean-distance()
# }
# 
# 
# class k_means {
#   <<routine>>
# }
# k_means --|> Routine
# k_means --|> Block
# k_means --|> Code
# k_means --|> Callable
# 
# 
# class find_clusters {
#   <<routine>>
# }
# find_clusters --|> Routine
# find_clusters --|> Block
# find_clusters --|> Code
# find_clusters --|> Callable

Here we create a Mermaid cell directly in the Markdown file:

use UML::Translators;
to-uml-spec('ML::Clustering', format => 'mermaid')

Remark: We use above the Markdown cell arguments perl6, outputLang=mermaid, outputPrompt=NONE.


Pie chart

Here we generate a dataset with random numerical columns:

use Data::Generators;
use Data::Reshapers;
my @tbl = random-tabular-dataset(12, 3, 
        column-names-generator => { &random-pet-name($_, method => &pick) },
        generators => [
            { random-variate(NormalDistribution.new( µ => 10, σ => 20), $_ ) },
            { random-variate(NormalDistribution.new( µ => 2, σ => 2), $_ ) },
            { random-variate(NormalDistribution.new( µ => 32, σ => 10), $_ ) }]);
say to-pretty-table(@tbl);

# +-----------+-----------+------------+
# |  Schmidt  | Loch Ness |  Atticus   |
# +-----------+-----------+------------+
# | -1.640302 | 28.378750 | 15.904661  |
# |  2.385714 | 38.302540 | 24.213946  |
# |  0.493925 | 38.725973 | 40.612606  |
# |  4.505899 | 30.662067 | 18.584966  |
# |  0.434399 | 16.541336 | 16.912362  |
# |  3.721586 | 44.420603 | 33.003731  |
# |  2.167267 | 34.555019 |  9.839326  |
# |  0.816624 | 25.008163 | -36.824675 |
# |  0.473044 | 32.095658 | 38.149468  |
# |  3.436236 | 23.021088 | 24.750144  |
# |  5.754028 | 38.892485 | 20.964051  |
# | -0.590391 | 23.630289 | -8.537537  |
# +-----------+-----------+------------+

Here we sum the columns:

@tbl.&transpose.map({ $_.key => [+] $_.value })

# (Loch Ness => 374.233971547851 Schmidt => 21.95802958385876 Atticus => 197.57304922350048)

Plot the sums with a Mermaid pie chart:

say 'pie showData';
say ' title My Great Pie Chart!';
@tbl.&transpose.map({ $_.key => [+] $_.value }).map({ say " {$_.key.raku} : {$_.value}" })

Remark: The pie chart in this blog post is “uploaded” — meaning it most likely does not correspond to the table above.


Make trie plot

Here we make a prefix tree (trie) with frequencies:

use ML::TriesWithFrequencies;
my $tr = trie-create-by-split( <bar bark bars balm cert cell> );
trie-say($tr);

# TRIEROOT => 6
# ├─b => 4
# │ └─a => 4
# │   ├─l => 1
# │   │ └─m => 1
# │   └─r => 3
# │     ├─k => 1
# │     └─s => 1
# └─c => 2
#   └─e => 2
#     ├─l => 1
#     │ └─l => 1
#     └─r => 1
#       └─t => 1

Here we just get nodes:

say 'mindmap';
say $tr.form.subst( / '├' | '─' | '└' | '│' | '└' /, ' '):g

mindmap
TRIEROOT => 6
  b => 4
    a => 4
      l => 1
        m => 1
      r => 3
        k => 1
        s => 1
  c => 2
    e => 2
      l => 1
        l => 1
      r => 1
        t => 1

Here we plot it with Mermaid-JS as a mindmap (requires version 9.2.2+):

say 'mindmap';
say $tr.form.subst( / '├' | '─' | '└' | '│' | '└' /, ' '):g;

Here we transform the trie into list of edges:

my @edges = $tr.node-probabilities.root-to-leaf-paths>>.map({ "{$_.key}:{$_.value.Str}" }).map({ $_.rotor(2 => -1).map({ "{$_[0]} --> {$_[1]}" }) }).&flatten;
.say for @edges.unique; 

# TRIEROOT:1 --> b:0.6666666666666666
# b:0.6666666666666666 --> a:1
# a:1 --> r:0.75
# r:0.75 --> s:0.3333333333333333
# r:0.75 --> k:0.3333333333333333
# a:1 --> l:0.25
# l:0.25 --> m:1
# TRIEROOT:1 --> c:0.3333333333333333
# c:0.3333333333333333 --> e:1
# e:1 --> l:0.5
# l:0.5 --> l:1
# e:1 --> r:0.5
# r:0.5 --> t:1

Here we plot it with Mermaid-JS as a graph:

say 'graph TD';
.say for @edges.unique; 

References

Articles

[AA1] Anton Antonov “Text::CodeProcessing”, (2021), RakuForPrediction at WordPress.

[AA2] Anton Antonov “Conversion and evaluation of Raku files”, (2022), RakuForPrediction at WordPress.

[AA3] Anton Antonov, “Generating UML diagrams for Raku namespaces, (2022), RakuForPrediction at WordPress.

Books

[ML1] Moritz Lenz, “Raku Fundamentals: A Primer with Examples, Projects, and Case Studies”, 2nd ed. (2020), Apress.

Packages

[AAp1] Anton Antonov, Text::CodeProcessing Raku package, (2021-2022), Zef ecosystem.

[AAp2] Anton Antonov, UML::Translators Raku package, (2021-2022), Zef ecosystem.

Videos

[AAv1] Anton Antonov “Conversion and evaluation of Raku files”, (2022), Anton Antonov’s channel at YouTube.

Lingua::Translation::DeepL

In brief

This blog post proclaims and describes the Raku package “Lingua::Translation::DeepL”, which provides access to the language translation service DeepL, [DL1]. For more details of DeepL’s API usage see the documentation, [DL2].

Remark: To use the DeepL API one has to register and obtain an authorization key.

Remark: This Raku package is much “less ambitious” than the official Python package, [DLp1], developed by DeepL’s team. Gradually, over time, I expect to add features to the Raku package that correspond to features of [DLp1].


Installation

Package installations from both sources use the zef installer (which should be bundled with the “standard” Rakudo installation file).

To install the package from Zef ecosystem use the shell command:

zef install Lingua::Translation::DeepL

To install the package from the GitHub repository use the shell command:

zef install https://github.com/antononcube/Raku-Lingua-Translation-DeepL.git


Usage examples

Remark: When the authorization key, auth-key, is specified to be Whatever then deepl-translation attempts to use the env variable DEEPL_AUTH_KEY.
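
For example (assuming DEEPL_AUTH_KEY is set in the environment):

say deepl-translation('Wie geht es dir?', to-lang => 'English', auth-key => Whatever);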

Basic translation

Here is a simple call (automatic language detection by DeepL and translation to English):

use Lingua::Translation::DeepL;
say deepl-translation('Колко групи могат да се намерят в този облак от точки.');

# [{detected_source_language => BG, text => How many groups can be found in this point cloud.}]

Multiple texts

Here we translate from Bulgarian, Russian, and Portuguese to English:

my @texts = ['Препоръчай двеста неща от рекомендационната система smrGoods.',
              'Сделать классификатор с логистической регрессии',
              'Fazer um classificador florestal aleatório com 200 árvores'];
my @res = |deepl-translation(@texts,
        from-lang => Whatever,
        to-lang => 'English',
        auth-key => Whatever);
        
use Data::Reshapers;
#.say for @res;
to-pretty-table(@res, align=>'l', field-names=><text detected_source_language>)

# +-----------------------------------------------------------------------+--------------------------+
# | text                                                                  | detected_source_language |
# +-----------------------------------------------------------------------+--------------------------+
# | Recommend two hundred things from the smrGoods recommendation system. | BG                       |
# | Make a classifier with logistic regression                            | RU                       |
# | Make a random forest classifier with 200 trees                        | PT                       |
# +-----------------------------------------------------------------------+--------------------------+

Remark: DeepL allows up to 50 texts to be translated in one API call. Hence, if the first argument is an array with more than 50 elements, then it is partitioned into up-to-50-elements chunks and those are given to deepl-translation.
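
Here is a minimal sketch of that chunking idea with Raku’s batch method (illustration only, not the package’s internal code):

my @texts = ('text ' X~ 1..120);
say @texts.batch(50)>>.elems;

# (50 50 20)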

Formality of the translations

The argument “formality” controls whether translations should lean toward informal or formal language. This option is only available for some target languages; see [DLp1] for details.

say deepl-translation('How are you?', to-lang => 'German', auth-key => Whatever, formality => 'more');
say deepl-translation('How are you?', to-lang => 'German', auth-key => Whatever, formality => 'less');

# [{detected_source_language => EN, text => Wie geht es Ihnen?}]
# [{detected_source_language => EN, text => Wie geht es dir?}]

say deepl-translation('How are you?', to-lang => 'Russian', auth-key => Whatever, formality => 'more');
say deepl-translation('How are you?', to-lang => 'Russian', auth-key => Whatever, formality => 'less');  

# [{detected_source_language => EN, text => Как Вы?}]
# [{detected_source_language => EN, text => Как ты?}]

Languages

The function deepl-translation verifies that the argument languages given to it are valid DeepL from- and to-languages. See the section “Request Translation”.

Here we get the mappings of abbreviations to source language names:

deepl-source-languages()

# {bulgarian => BG, chinese => ZH, czech => CS, danish => DA, dutch => NL, english => EN, estonian => ET, finnish => FI, french => FR, german => DE, greek => EL, hungarian => HU, indonesian => ID, italian => IT, japanese => JA, latvian => LV, lithuanian => LT, polish => PL, portuguese => PT, romanian => RO, russian => RU, slovak => SK, slovenian => SL, spanish => ES, swedish => SV, turkish => TR, ukrainian => UK}

Here we get the mappings of abbreviations to target language names:

deepl-target-languages()

# {bulgarian => BG, chinese simplified => ZH, czech => CS, danish => DA, dutch => NL, english => EN, english american => EN-US, english british => EN-GB, estonian => ET, finnish => FI, french => FR, german => DE, greek => EL, hungarian => HU, indonesian => ID, italian => IT, japanese => JA, latvian => LV, lithuanian => LT, polish => PL, portuguese => PT, portuguese brazilian => PT-BR, portuguese non-brazilian => PT-PT, romanian => RO, russian => RU, slovak => SK, slovenian => SL, spanish => ES, swedish => SV, turkish => TR, ukrainian => UK}


Command Line Interface

The package provides a Command Line Interface (CLI) script:

deepl-translation --help

# Usage:
#   deepl-translation <text> [-f|--from-lang=<Str>] [-t|--to-lang=<Str>] [-a|--auth-key=<Str>] [--formality=<Str>] [--timeout[=UInt]] [--format=<Str>] -- Text translation using the DeepL API.
#   
#     <text>                  Text to be translated.
#     -f|--from-lang=<Str>    Source language. [default: 'Whatever']
#     -t|--to-lang=<Str>      Target language. [default: 'English']
#     -a|--auth-key=<Str>     Authorization key (to use DeepL API.) [default: 'Whatever']
#     --formality=<Str>       Language formality in the translated text. [default: 'Whatever']
#     --timeout[=UInt]        Timeout. [default: 10]
#     --format=<Str>          Format of the result; one of "json" or "hash". [default: 'json']
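
For example:

deepl-translation 'Колко групи могат да се намерят в този облак от точки.' -t=English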

Remark: When the authorization key argument “auth-key” is set to “Whatever” then deepl-translation attempts to use the env variable DEEPL_AUTH_KEY.


Mermaid diagram

The following flowchart corresponds to the steps in the package function deepl-translation:


References

[DL1] DeepL, DeepL Translator.

[DL2] DeepL, DeepL API.

[DLp1] DeepL, DeepL Python Library, (2021), GitHub/DeepLcom.

Conversion and evaluation of Raku files

This post proclaims a presentation in which we do conversion and evaluation of Raku files. Those procedures facilitate Literate programming with Raku.


Presentation plan


Mermaid diagram


Jupyter notebook

SystemOpen["http://localhost:8888/tree/GitHub/lizmat/articles"]

Raku setup

In order to use Raku in Mathematica we have to:

  • Load a “Raku mode” package
  • Convert the notebook into “Raku mode”
  • Start a Raku session.

Remark: If no Raku session is started then every Raku cell evaluation is “on its own”, i.e. independent from the rest.

Import["https://raw.githubusercontent.com/antononcube/ConversationalAgents/master/Packages/WL/RakuMode.m"]
StartRakuProcess["Raku" -> $HomeDirectory <> "/.rakubrew/shims/raku"]

(* KillRakuProcess[]
   KillRakuSockets[] *) 

Here we convert the “hosting” notebook into “Raku mode”:

RakuMode[]

Raku cell example:

1+1_000
(*"1001"*)

All shell commands

 1048  git pull
 1049  cat dont-fear-the-grepper-* > dont-fear-the-grepper-all.md
 1050  open dont-fear-the-grepper-all.md
 1051  file-code-chunks-eval dont-fear-the-grepper-all.md
 1052  open dont-fear-the-grepper-all_woven.md
 1053  from-markdown dont-fear-the-grepper-all.md
 1054  from-markdown dont-fear-the-grepper-all.md -t=pod6 -o=dont-fear-the-grepper-all.pod6
 1055  open dont-fear-the-grepper-all.pod6
 1056  file-code-chunks-eval dont-fear-the-grepper-all.pod6
 1057  open dont-fear-the-grepper-all_woven.pod6
 1058  from-markdown dont-fear-the-grepper-all.md | pbcopy
 1059  open dont-fear-the-grepper-all_woven.md
 1060  from-markdown --help
 1061  from-markdown dont-fear-the-grepper-all.md --raku-code-cell-name=RakuInputExecute --l=raku | pbcopy
 1064  jupytext --help
 1066  jupytext --from=md --to=ipynb dont-fear-the-grepper-all.md -o=dont-fear-the-grepper-all.ipynb
 1067  open dont-fear-the-grepper-all.ipynb
 1068  open dont-fear-the-grepper-all.md
 1069  jupytext --from=md --to=ipynb dont-fear-the-grepper-all.md -o=dont-fear-the-grepper-all.ipynb
 1070  open http://localhost:8888/tree/GitHub/lizmat/articles 


References

Articles

[AA1] Anton Antonov, “Raku Text::CodeProcessing”, (2021), RakuForPrediction at WordPress.

[AA2] Anton Antonov, “Connecting Mathematica and Raku”, (2021), RakuForPrediction at WordPress.

[EM1] Elizabeth Mattijsen, “Don’t fear the grepper” series of articles at Dev.to.

Packages and programs

[AAp1] Anton Antonov, “Text::CodeProcessing” Raku package, (2021), GitHub/antononcube.

[AAp2] Anton Antonov, “Markdown::Grammar” Raku package, (2022), GitHub/antononcube.

[BDp1] Brian Duggan, Jupyter::Kernel Raku package, (2017), GitHub/bduggan.

[JTp1] jupytext , “Jupyter Notebooks as Markdown Documents, Julia, Python or R Scripts”, jupytext.readthedocs.io.

Implementing Machine Learning algorithms in Raku (TRC-2022 talk)

Last weekend (on 2022-08-13) I gave a presentation titled “Implementing Machine Learning algorithms in Raku” at The Raku Conference 2022.

As the title hints, in the presentation we discuss the implementations of different Machine Learning (ML) algorithms in Raku. 

The main themes of the presentation are: 

– ML workflows demonstration

– Software engineering perspective on ML implementations 

– ML algorithms utilization with Raku’s unique features 

Here is a mind-map of the presentation:

Here is a mind-map of the considered ML algorithms: 

The presentation “slides” are available as:

Link to presentation recording.