Busy people need to arrive at solutions.
However, it can worth exploring AOP solutions to get a better understanding of the problem or to learn new techniques or ways of writing code.
"It is better to have 100 functions operate on one data structure than to have 10 functions operate on 10 data structures." This quote is from Alan Perlis' Epigrams on Programming (1982).
F¨
and rank operator F⍤j k
are both loops conceptually.Given 12 salaries of employees in 4 groups:
s ← {⎕RL←42 ⋄ 20000+5000×?⍵⍴3}12
g ← {⎕RL←42 ⋄ ⎕A[?12⍴⍵]}4
⎕←↑g s
For each salary, if the person is in group B, increase by $10\%$. If the person is in group D, increase by $5\%$. Otherwise, salary remains the same.
s×1+(0.05×'D'=g)+(0.1×'B'=g)
Factor this out as a general solution Increase
.
Increase ← {
⍺: (groups to increase) (percentage increases)
⍵: (groups) (corresponding salaries)
←: salaries increased
}
'BD' (10 5) Increase g s
35000 35000 25000 25000 35000 35000 33000 33000 35000 35000 26250 35000
A workshop begins at 10:00. Arriving
'early'
.'on time'
.'late'
.start ← 600 610 ⍝ Start time window in minutes since midnight
arrive ← 590+10,20,?10⍴30 ⍝ Arrival time of 12 participants
⎕←arrive
↑(arrive)('early' 'on time' 'late'[1++⌿start∘.≤arrive])
('early' 'on time' 'late'[1+start⍸arrive])
This problem comes from phase 2 of the 2022 APL Problem Solving Competition.
Each year, Dyalog Ltd sponsors a user meeting where Dyalog staff and users have an opportunity to present topics of interest and interact with one another. Due to the impact of COVID-19, the meetings for 2020 and 2021 were conducted virtually using Zoom. People registered ahead of time and could then sign-on and attend, virtually, any or all sessions. There were two partial days of sessions in each year.
After the conclusion of the user meeting, Zoom sent Dyalog Ltd a CSV file containing information including when each attendee joined or left the meeting. The tasks in this problem involve analyzing this information. There are two files that you will need for this problem:
AttendeesSA1.csv consists of 5 columns:
ScheduleSA1.csv consists of 6 columns:
To reduce the amount of data parsing and focus on the algorithm, we have converted timestamps into 1-second precision time numbers. These are the last 2 columns in the data. We have also removed people who registered but did not attend from the original data.
If you are interested in trying to parse the original data, the files are:
If you're using Dyalog 18.2, try the ]Get
user command:
]Get "C:\Users\rpark\Documents\APL\Workshops\Dyalog22\SA1 Idiomatic Expression and Array Oriented Solutions in APL\GitHub Publish 2022 SA1\attendeesSA1.csv"
]Get "C:\Users\rpark\Documents\APL\Workshops\Dyalog22\SA1 Idiomatic Expression and Array Oriented Solutions in APL\GitHub Publish 2022 SA1\scheduleSA1.csv"
attendeesSA1↓⍨←1 ⍝ Remove header rows
scheduleSA1↓⍨←1
Otherwise, this is the incantation to bring in the data programmatically:
path ← 'C:\Users\rpark\Documents\APL\Workshops\Dyalog22\SA1 Idiomatic Expression and Array Oriented Solutions in APL\GitHub Publish 2022 SA1\AttendeesSA1.csv'
3↑attendeesSA1 ← ⊃⎕CSV path ⍬ 4 1 ⍝ Discard header row
path ← 'C:\Users\rpark\Documents\APL\Workshops\Dyalog22\SA1 Idiomatic Expression and Array Oriented Solutions in APL\GitHub Publish 2022 SA1\ScheduleSA1.csv'
3↑scheduleSA1 ← ⊃⎕CSV path ⍬ 4 1 ⍝ Discard header row
Write a dyadic function Attended
with the syntax:
result ← attendees Attended schedule
which returns a $333 \times 48$ Boolean matrix in which:
uattendees
).schedule
.a 1
in result[i;j]
indicates that uattendee[i]
attended schedule[j;]
.
Attendance means that they were present for at least half of the time (in minutes) that the session was held.
For example, for a session 14:00-14:30, if an attendee joins at 14:00 and leaves at 14:14, they did not attend.
who ← 'Zaria Matthews' 'Kathryn Stafford' 'Marlene Lin' 'Kayleigh Rodgers'
+⌿attendeesSA1 Rude scheduleSA1
9 9 2 5 3 0 1 1 0 1 2 4 12 1 0 0 4 1 1 2 3 0 1 6 4 1 1 1 7 1 2 1 1 2 3 0 8 4 1 0 4 1 4 1 2 0 2 0
(who[⍋who]),⍪↓((attendeesSA1[;1]∊who)⌿attendeesSA1)Attended scheduleSA1
┌────────────────┬───────────────────────────────────────────────────────────────────────────────────────────────┐
│Kathryn Stafford│1 1 1 1 1 1 1 1 1 0 0 0 1 1 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0│
├────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────┤
│Kayleigh Rodgers│0 1 1 1 0 0 0 0 0 0 0 1 1 1 1 0 1 1 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1│
├────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────┤
│Marlene Lin │0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0│
├────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────┤
│Zaria Matthews │0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0│
└────────────────┴───────────────────────────────────────────────────────────────────────────────────────────────┘
Write a dyadic function Rude
with the syntax:
result ← attendeesSA1 Rude scheduleSA1
which returns a $333\times48$ integer matrix in which:
uattendees
).schedule
.1
in result[i;j]
indicates that uattendee[i]
joined schedule[j;]
after it started and left before it ended - within a single join (single row of attendeeSA1
). ⍴r←attendeesSA1 Rude scheduleSA1
333 48
6↑(⊂∘⍒⌷⊢)((⊂∘⍋⌷⊢)∪attendeesSA1[;1]),⍨⍪+/attendeesSA1 Rude scheduleSA1
┌─┬───────────────┐
│5│Lyric Clarke │
├─┼───────────────┤
│3│Tristin Heath │
├─┼───────────────┤
│3│Teagan Garrison│
├─┼───────────────┤
│3│Sam Robles │
├─┼───────────────┤
│3│Jaylee Estes │
├─┼───────────────┤
│3│Jaliyah Andrade│
└─┴───────────────┘
Don't bother if not at least 2× speedup
]runtime -c "expr1" "expr2"
-50% 2× faster
+100% 2× slower
A data API has pagination. Pagination is when the user can request a subset of the data (a page) from the data based on a variable number of items per page.
An example web API call might be /api/data?per_page=20&page=3
return the 3rd page of data for 20 items per page. That is, return items 41-60 in
⎕IO←1
.
When we return this data, we also return references to other pages. A simplified version of the Paginate
function behaves as follows:
Paginate ← {
⍺: integer vector (2=≢⍺) :: (page per_page)
⍵: integer scalar :: ≢data
←: integer vector :: indices of pages
}
⍳n
. p
is within the first 5, return 1... 7,(n-1),n
. Otherwise:p
is within the last 5, return 1,2,(n-7)... n
. Otherwise:1,2,(p-2),(p-1),p,(p+1),(p+2),(n-1),n
.The returned indices is always a unique integer list in ascending order.
Examples:
3 7 Paginate 23
1 2 3 4
5 7 Paginate 100
1 2 3 4 5 6 7 14 15
6 7 Paginate 100
1 2 4 5 6 7 8 14 15
10 7 Paginate 100
1 2 8 9 10 11 12 14 15
11 7 Paginate 100
1 2 9 10 11 12 13 14 15
Write 3 functions:
Paginate1
using explicit control structures - :Keywords
or {dfn : guards}
Paginate2
which generates all necessary indices and removes duplicatesPaginate3
which only generates required indices, but uses no explicit control structures]aplcart fast partitioned
Nested arrays allow for a wealth of intuitive and useful ways to manipulate collections of ragged data. However, the techniques from before nested arrays can be very fast.
Flat, homogeneous arrays are stored sequentially in memory. Their shape information at the front (counting elements is fast) and the ravel of elements afterwards.
ABCD
EFGH
IJKL
MNOP
QRST
UVWX
2 3 4 A B C D E F G H I J K L M N O...
Nested arrays are pointer arrays internally, so it can take longer for the interpreter to traverse memory looking for the data.
┌────┬────┬────┐
│ABCD│EFGH│IJKL│
├────┼────┼────┤
│MNOP│QRST│UVWX│
└────┴────┴────┘
In this section, we will investigate the flat array versions of some partitioned array techniques.
⊃(⊣,' ',⊢)/⌽¨' '(≠⊆⊢)'split reverse and then join'
s←' ' ⋄ Pad←s∘(⊣,,⍨) ⋄ Unpad←1↓¯1∘↓
ReversePart←s∘{⍵[⌽⍒+\1@1⊢⍺=⍵]}
ReverseWords ← Unpad⍤ReversePart⍤Pad
ReverseWords 'reverse the scan'
Regex with ⎕R/⎕S
is very powerful for a wide variety of text manipulations. However, performance is usually much better if you can use APL directly.
'[^\w]'⎕R'*'⊢'replace these $#!? non word characters'
a←¯1⎕C⎕A
{r←⍵ ⋄ ((~r∊a)/r)←'*' ⋄ r}'replace these $#!? non word characters'
{r←⍵ ⋄ r[⍸~r∊a]←'*' ⋄ r}'replace these $#!? non word characters'
Defanging an IP address means to surround full stops '.'
with square brackets [.]
. Apparently this is to prevent users from accidentally clicking on malicious links.
Write a function Defang
which accepts as argument a simple character vector which consists of groups of digits interspersed with full stops ('.'≡⎕UCS 46
) and returns a simple character where full stops are now surrounded by square brackets.
Defang ← {
⍵ : simple char vec :: IP address
← : simple char vec :: Defanged IP address
}
Defang '192.45.344.22'
192[.]45[.]344[.]22
No peeking on APLCart!
With partitioned-enclose ⍺⊂⍵
, new partitions of ⍵
begin positive integers in ⍺
and continue up to the next positive integer.
1 0 1 0 0 ⊂ 'ABCDE'
Write a function equivalent to {+/¨⍺⊂⍵}
which does not create intermediate nested arrays.
Watch out for edge cases:
⍺⊂⍵
(v18 feature)?n←¯3+?12⍴6
p ← 1 0 0 1 0 1 0 0 0 0 1 0
⎕←↑p n
+/¨p⊂n
This is a number-list sequence in which the next list of numbers is found by counting the adjacent equal numbers in the current list.
Look | Say |
---|---|
1 | One 1 |
1 1 | Two 1s |
2 1 | One 2, one 1 |
1 2 1 1 | One 1, one 2, two 1s |
1 1 1 2 2 1 | Three 1s, two 2s, one 1 |
3 1 2 2 1 1 | One 3, one 1, two 2s, two 1s |
... | ... |
First, solve this problem in any way you see fit.
Then, see if you can leverage flat array techniques to make a fast solution.
Feel free to search APLCart to find flat array idioms to help you.
The monadic function LookAndSay
takes a look-and-say sequence and returns the next one.
LookAndSay⍣6⊢1
1 3 1 1 2 2 2 1
LookAndSay⍣7⊢1
1 1 1 3 2 1 3 2 1 1
The monadic function SayAndLook
takes a look-and-say sequence and returns the previous one.
SayAndLook 1 1 1 3 2 1 3 2 1 1
1 3 1 1 2 2 2 1
In "snail sort", we want to unravel the elements of an array in the order of a spiral beginning at the top left and moving clockwise and inwards towards the center of the spiral.
A recursive solution is as follow:
⎕←5 5⍴⎕A
{0∊⍴⍵:0⍴⍵ ⋄ (1⌷⍵),∇⊖⍉1↓⍵} 5 5⍴⎕A
Task: Try to write a function which does snail sort without looping or recursion.
+\
is likely useful.
A non-looping solution is provid ed in the following summary tag.
SnailSort ← {
s←⊃⍴⍵
i←+\(1↓2/⌽⍳s)/(¯1+2×s)⍴(+,-)1,s
(,⍵)[i]
}
You might notice that this solution only works for square matrices.
Task: Generalise the solution to work with any shape matrix.