Deriving the Y-Combinator

The Y Combiantor is one of the most confusing pieces of code I’ve ever seen.

const Y = (f) =>
  ( x => f(v => x(x)(v)) )(
    x => f(v => x(x)(v))
  );

Used as such:

const factorial = Y(function (fac) {
  return function (n) {
    return (n == 0 ? 1 : n * fac(n - 1));
  }
});
 
factorial(5)
  //=> 120

Of course, that’s cheating. Real men use it like this:

((f) =>
  ( x => f(v => x(x)(v)) )(
    x => f(v => x(x)(v))
  )(function (fac) {
  return function (n) {
    return (n == 0 ? 1 : n * fac(n - 1));
  }
}))(5);
  //=> 120

because it looks cooler that way, and because the goal of the Y Combinator is to do recursion without naming any functions. It has no practical use, but it has theoretical importance. And oh yeah, it’s an awesome way to sink your time if you’re looking for a good puzzle.

I stared at it for 3 hours before I understood it.

Oh the joy of figuring things out… But then I thought to myself, could I derive it if I didn’t know it?

We’re trying to achieve the equivalent of this, without naming the function, of course:

function fac (n) {
  return (n == 0 ? 1 : n * fac(n - 1));
}

A first pass:

(function (fac, n) {
  return (n == 0 ? 1 : n * fac(fac, n - 1));
}(function (fac, n) {
  return (n == 0 ? 1 : n * fac(fac, n - 1));
}, 5));
  //=> 120

That accomplishes it, but that’s silly. The second pass is much better:

(function (fac) {
  return function (n) {
    return fac(fac)(n);
  }
}(function(fac) {
  return function (n) {
    return (n == 0 ? 1 : n * fac(fac)(n - 1));
  }
}))(5);
  //=> 120

That works, so I guess that qualifies as a solution, but it has a big flaw that the canonical Y Combinator doesn’t – we have to call fac(fac)(n-1) in the exposed function instead of just fac(n-1). Not cool… But wait! That function taking n is the same as the one in the Y Combinator. Or is it? I got stuck here for a while, and I think I even started talking to myself at some point, so I decided to try a different approach.

Time to cheat a little bit, with a named function. Too much lambda. Need name. More understand.

(function (wrapper) {
  function injector() {
    return wrapper(n => injector()(n));
  }
  return injector();
}(function(fac) {
  return function (n) {
    return (n == 0 ? 1 : n * fac(n - 1));
  }
}))(5);
  //=> 120

This idea took a bit of a leap, but it turns out this is exactly what the Y Combinator is doing, in a more sane, readable way of course. I named the function injector, because it is injecting a function into the fac parameter of the wrapper function. An alternative name might have been recur or runner, because the function it’s injecting does those things. This makes things clearer. But I can’t use named functions. Let’s translate this back to crazy.

Now I can use the idea from the 2nd pass of passing the function to itself:

(function (wrapper) {
  return
  (function (injector) {
    return wrapper(n => injector(injector)(n));
  }(function (injector) {
    return wrapper(n => injector(injector)(n));
  }));
}(function(fac) {
  return function (n) {
    return (n == 0 ? 1 : n * fac(n - 1));
  }
}))(5);
  //=> 120

Yikes. injector(injector) looks crazy, but that’s just what we have to do when we use the pass-function-to-self-trick. After all, it is a function passing itself to itself, so if you want to use it, you have to have it call itself. Where am I?

Anyway, time to reap the benefits. This is exactly the Y Combinator! To finish it off, and for absolute correctness, we have to rewrite it to make it as obscure as possible, with less descriptive names.

var result = (f =>
 (x => f(n => x(x)(n)))
 (x => f(n => x(x)(n))))
(fac => n => (n == 0 ? 1 : n * fac(n - 1)))(5);
  //=> 120

There, that’s better.

Javascript: generators for async

I stumbled across this article by James Long (@jlongster) on using generators for async as an alternative to async/await. His Q.async example, using Kris Kowal’s (@kriskowal) Q library, caught my eye and I decided to try implementing the async function without peeking at Q’s code. It took me a little while to come up with this solution, but the result is pleasingly simple!

async accepts a generator and returns a function that returns a promise. The yielded expressions in the generator are the fulfilled values. This allows running asynchronous code in synchronous style, without using async/await. Here’s a simple example of how it’s used:

Notice that you can yield promises or plain values, it doesn’t matter. Also, this supports using try/catch with asynchronous calls. Pretty cool!

For comparison, here is the Q.async implementation. After writing this up I also found the solution here (essentially same as Q.async). It differs from mine in that I’m not using an explicit try/catch to turn the synchronous error into an asynchronous one, but it happens implicitly due to being wrapped in a Promise, so I think the result/behavior is the same.

Firefox add-on released

I recently released my first Firefox add-on. I’m always copy-pasting words from the browser into my terminal to look them up with dict client. It’s simply the best dictionary tool I’ve ever used, because it looks up words from many dictionaries at once. You can find out more about DICT here and here.

It looks up words by calling dict client on your machine in a sub-process. It can also automatically save words you look up into a list to review later. Double click a word on any web page, and the extension will spawn a process to call dict and then display the output in a popup. As an added feature, the words you look up are automatically added to a list for later review.

Screenshot of dict-extension in use

This addon does not actually implement the DICT protocol, nor call any DICT servers on it’s own. It delegates that entirely to the dict client on your machine. As an alternative to my add-on, this extension is quite good and does implement the DICT protocol, if that is what you are looking for.

However, I suspect the above mentioned add-on, which does implement a DICT client, will stop working at some point relatively soon, because like many useful add-ons, it uses require('chrome'), and Mozilla is doing away with the add-on SDK and many of it’s low level APIs. A lot of developers are understandably upset about that. I was going to implement it as well, but due to these plans by Mozilla, I decided to stay away, as there’s currently no way to do it without using chrome and the low-level APIs.

I think that my add-on will only work on Linux, and maybe on Mac (though I have only tested on my machine), as you must have the dict client installed for it to work.

You can install the add-on from the Firefox add-on listing page. The code is also hosted on github.

WordPress backup script

In my previous post I showed my WordPress update script. However, it’s not safe to update without first backing everything up in case something goes wrong. This is a script that I adapted from this post. It backs up both files and the database.

#!/bin/bash

echo "In $0"

if [ $# -gt 0 ]; then
    NOW=$1
else
    NOW=$(date +"%Y-%m-%d-%H%M")
fi

FILE="maksle.com.$NOW.tar"
BACKUP_DIR="/home/private/backups"
WWW_DIR="/home/public/blog"

DB_HOST="dbhost"
DB_USER="backupUser"
DB_PASS="backupUserPassword"
DB_NAME="wp_db"
DB_FILE="maksle.com.$NOW.sql"

# WWW_TRANSFORM='s,^home/public/blog,www,'
# DB_TRANSFORM='s,^home/private/backups,database,'
WWW_TRANSFORM=',/home/public/blog,www,p'
DB_TRANSFORM=',/home/private/backups,database,'


# tar -cvf $BACKUP_DIR/$FILE --transform $WWW_TRANSFORM $WWW_DIR
tar -cvf $BACKUP_DIR/$FILE -s $WWW_TRANSFORM $WWW_DIR

mysqldump --host=$DB_HOST -u$DB_USER -p$DB_PASS $DB_NAME > $BACKUP_DIR/$DB_FILE

# tar --append --file=$BACKUP_DIR/$FILE --transform $DB_TRANSFORM $BACKUP_DIR/$DB_FILE
tar --append --file=$BACKUP_DIR/$FILE -s $DB_TRANSFORM $BACKUP_DIR/$DB_FILE
rm $BACKUP_DIR/$DB_FILE
gzip -9 $BACKUP_DIR/$FILE

You may have noticed that there is a commented out version of the tar transform variable and command. My host has a version of tar (bsdtar 2.8.5) that doesn’t have the --transform option, but does have an alternative -s option that does more or less the same thing. The idea is that the backup will have directory stucture backup/file.php rather than /home/public/blog/file.php for example.

mysqldump has many options you can pass it, which you may want to look into. However, the option --opt is a default, and does what I want. It is probably good enough for most sites. The problem with --opt is that it requires locking the table during the export, which also has implications on permissions required for your backup user. What backup user? Well, since you are storing the DB user and password in plain text in your script, you should not use your administrator user. It’s best to create a backup user with minimal permissions necessary to do the backup. Ideally that would be just SELECT privileges, but with the mentioned --opt option, LOCK TABLES privileges are required too. Here’s how you’d set that user up:

MySQL> CREATE USER backup IDENTIFIED BY 'randompassword';
MySQL> GRANT SELECT ON *.* TO backup;
MySQL> GRANT LOCK TABLES ON *.* TO backup;

I call the above script from a cron job on my local computer:

#!/bin/bash

# Exit if any command fails
set -e 
# Don't allow use of unintialized variables
set -u 


# Set up some variables
NOW=$(date +"%Y-%m-%d-%H%M")
BACKUP_DIR="$HOME/Documents/backups"
LOG_DIR="${BACKUP_DIR}/logs"
LOG_FILE="maksle-backup-$NOW.log"

# Redirect standard output and error output to a log file.
exec > >(tee -a "${LOG_DIR}/${LOG_FILE}")
exec 2> >(tee -a "${LOG_DIR}/${LOG_FILE}" >&2)

mkdir -p $LOG_DIR
cd $BACKUP_DIR

# The cool part: Run my local wp-backup.sh on the remote web server.
ssh maksle 'bash -s' < ~/bin/wp-backup.sh $NOW

# Sync the remote server backup logs with the backups directory on my local machine. After all, what good are backups if your webserver is down and you can't access them?
rsync -havz --stats maksle:/home/private/backups/ $BACKUP_DIR

Of course, the remote server can get filled up with backups, so I have another script that removes any backups more than 5 days old. I continue to have as many as far back as I want on my local machine.

#!/bin/bash

set -e
set -u

# Error out if a command in a pipe fails
set -o pipefail

# Usage example:
# wp-remove-old-backups.sh /home/private/backups 5

WORKING_DIR=$1
cd $WORKING_DIR

# This would be 5 if called as in the Usage example 
declare -i allow=$2
# This gets the number of files in the directory, which we assume are all backup tgz files
declare -i num=$(ls | wc -l)

if [ $num -gt $allow ]; then
    # Remove all but latest files
    (ls -t | head -n $allow; ls) | sort | uniq -u | sed -e 's,.*,"&",g' | xargs rm -f
fi

The above command works by first printing the latest 5 files, and then all the files. This way the latest 5 files get printed twice. This allows uniq -u to filter out the latest 5, and the rest of the files get sent to their slaughter. The intermediate sed -e 's,.*,"&",g' makes it work when there are spaces in the filenames by wrapping the filenames in quotes (avoid spaces in filenames).

Of course, I call this script via a local cron job as well.

#!/bin/bash

BACKUP_DIR="$HOME/Documents/backups"
LOG_DIR="${BACKUP_DIR}/logs"
LOG_FILE="maksle-backup-cleanup-$NOW.log"

exec > >(tee -a "${LOG_DIR}/${LOG_FILE}")
exec 2> >(tee -a "${LOG_DIR}/${LOG_FILE}" >&2)

ssh maksle 'bash -s' < ~/bin/wp-remove-old-backups.sh "/home/private/backups" 5

I hope that will help someone out!

Wordupress update script

WordPress offers the one-click update, but the file permissions required for that convenience are a security risk. For it to work, it essentially requires setting all files to the server group (usually web or apache or nobody user) and giving all those files group write permissions. Doing so trades security for convenience. Eventually there will be a security vector in the WordPress code, and with writeable PHP files everywhere, hackers will make short work of it.

WordPress provides manual updating instructions, and even gives a few code snippets here and there, but there’s really nothing there that should require human intervention. This little script updates WordPress to the latest version. The location of this script should be in a location on the web server not accessible to the web, which is /home/private/update-wp in my case.

#!/bin/bash

set -u
set -e

# Cleanup from a previous call
rm -f latest.tar.gz
rm -rf wordpress
rm -rf backuptemp

# Get the latest, unzip it, and untar it
wget https://wordpress.org/latest.tar.gz
tar -xzvf latest.tar.gz

# The location of your wordpress install
blog=/home/public/blog

# Copy these just in case
mkdir backuptemp
cp $blog/wp-config.php $blog/.htaccess backuptemp

# These are the files to be deleted as mentioned in the WordPress Manual Update link
rm $blog/wp*.php
rm $blog/license.txt $blog/readme.html $blog/xmlrpc.php
rm -rf $blog/wp-admin $blog/wp-includes

# Copy the files to overwrite what we have
# It will leave files alone that are in $blog/wp-content but not in the latest bundle which is what we want
rsync -avz wordpress/ "${blog}/"
cp backuptemp/wp-config.php backuptemp/.htaccess $blog

echo "DONE"

If something goes wrong you have your daily backups to save you (because you are backing things up, aren’t you?). I will write another post shortly showing my WordPress files and database backup script.

Tagedit.el for nxml-mode

If you use EMACS and have used lisp, you may have heard of paredit and smartparens. They allow you to operate on the Abstract Syntax Tree directly which can require a bit of a mind shift to get used to. This has been said: “If you think paredit is not for you then you need to become the kind of person that paredit is for.”

Check out this segment of a talk with Magnar Sveen, one of my biggest EMACS inspirations, discuss paredit. Here is Magnar showing off his use of paredit.

If you have used or heard of paredit, then you may have also heard about tagedit. It’s basically bringing some paredit features to html editing. I’ve been using it for a while and it’s both a pleasure to use and a huge time saver.

For a while it has been bothering me that I can’t use those awesome features when working on XML. I felt there is just no reason why I should get to enjoy that in html-mode but not in nxml-mode. nXML is the standard mode for xml in EMACS. I use it heavily at work for editing XSLT files.

This past weekend I wrote tagedit-nxml.el, a small package that makes tagedit compatible with nxml-mode. The “problem” was that tagedit was made with html-mode in mind, which derives from sgml-mode and uses sgml-mode functions to traverse the document. nxml-mode, however, is not derived from sgml-mode, but from text-mode, and traversing the document just doesn’t work the same way. Luckily, most of the functions I needed to modify were made available by tagedit.el to override. After showing the package to Magnar, the author of tagedit, he quickly provided function overrides that I needed to avoid having to use defadvice (functions like forward-list and backward-sexp). I can’t wait to start using it at work. This was a lot of fun and I learnd a lot of awesome elisp features.

XSLT dependency viewer for EMACS

I’ve written an XSLT dependency viewer for EMACS. It’s very similar to the package found here http://www.thoughtcrime.us/software/xslt-dependency-graph/. However, that library is for XLST 2.0 while I have to use XSLT 1.0 at work.

The parsing of the files to traverse the import/includes is done in EMACS lisp, which generates a dot diagram. That is then piped into the graphviz dot data visualization program and opened in your favorite PDF viewer. Graphviz is like LaTeX but for generating graphs of all kinds. Check out this graphviz dot guide that will give you an idea what it is capable of. Pretty powerful stuff.

First Pull Request

I have just made my first pull request on github. https://github.com/magnars/expand-region.el/pull/148

My contribution was to Magnar Sveen’s awesome expand-region project. The fix was for nxml-mode. Expand region inside an xml attribute was including the outer quotes first before first expanding to just the inner quotes. It was also not properly expanding to the attribute when there are namespaces in the attribute. This fix amends that.

Magnar messaged me that expand-region is headed for the emacs core. Awesome! All contributors need to sign the Free Software Foundation copyright papers. See https://gnu.org/licenses/why-assign for reasons. I went ahead and emailed assign@gnu.org and signed away my copyright on this piece of code.

I’m pretty excited to see this go through, because not everyone’s first pull request ever incidentally also makes it into a major FSF project, let alone into EMACS core!

etags-update-mode

Just a few days ago I wrote my first EMACS minor-mode, called etags-update-mode. It updates your TAGS file on save. It’s heavily inspired by another package/minor mode with the same name by Matt Keller.

In order to update the tags for a file on save, Matt’s etags-update-mode calls a perl file to delete any previous tags for a specific file in a TAGS file before it appends the new definitions in the file. Also, with that package the minor mode is defined as a global minor mode.

I wanted the functionality that the package provided, but I didn’t want it to be a global minor mode (the only global minor mode that I’ve used that I’m aware of and that I like having everywhere is YaSnippet). I also didn’t see why there should be a reliance on perl. I wanted to do it all in elisp.

So I wrote a much simpler version of etags-update-mode that is a regular minor mode and does all it’s work in EMACS. I’ll be updating it as I continue to use it.

EMACS etags

EMACS has an etags.el package that supports use of etags, the EMACS version of ctags. It tags your source code so you can jump directly to the source for a function, variable, or other symbol. I’ve been using it heavily with C++ and C# (though for C++, I’ve supplanted it with GNU Global, and there is an EMACS package for that too, ggtags).

I wanted the same functionality for xslt, which I use heavily at work. Luckily exuberant-ctags and etags both provide support for extending support to other languages, by supplying regular expressions.

I put the following regular expressions in ~/.ctags:

--langdef=xslt
--langmap=xslt:.xsl
--regex-xslt=/<xsl:template name="([^"]*)"/1/
--regex-xslt=/<xsl:template match="[^"]*"[ \t\n]+mode="([^"]*)"/1/
--regex-xslt=/<xsl:variable name="([^"]+)"/1/

… and generate the TAGS file

ctags -e -o TAGS *.xsl

I can now jump to the definition of any variable or template in my xsl files!

musings on programming, chess, martial arts, and other interests