Category Archives: programming

Preserving Color in Style Transfer

Image style transfer is extremely fun and I had to just return to it one more time. I noticed that some people have started adding color transfer, as a recent paper came out on Preserving Color in Neural Artistic Style Transfer .

Throughout this post I’m modifying the following photo I took back in 2011 in Ireland.

A street in Ireland (2011 vacation)

Take this Starry Night Van Gogh rendition.

It is… blue.

But by preserving color, the result is much nicer. The color is transferred from the content image to the style image before the stylized version is generated. The mean and covariance across the RGB channels is updated in the style image to match the content image.

The other method mentioned in the paper is by luminance transfer. I will probably implement that as well and add it here — the math is very similar, and simpler in fact, as covariance matrix is not needed as there is only one channel being changed.

Sometimes color transfer isn’t needed, but when it is, it’s pretty handy. It’s remarkable that this can be done by simply changing some statistics in the image.

Here are a couple of cool ones from today — they did not require/use color transfer but I had to include them because they are pretty cool!

DCGAN on MNIST

The code for this implementation is on github.

As part of the fast.ai Deep Learning For Coders part 2 course, we implemented the original GAN and DCGAN. The original GAN implementation uses a simple multi-layer perceptrons for the generator and discriminator, and it does not work very well. DCGAN uses a convolutional architecture and does better. However, the result in the course notebook did not look impressive despite many thousands of epochs of training. It’s possible to get it to a better state as I have succeeded in doing here. There is room for more improvement still. I suspect the instructor did not try too hard to show that though, because the newer WGAN, introduced a bit later, is better and easier to train. However, this post is about implementing the DCGAN.

It took me a while to get it to work well. Despite trying many things, what seemed to finally do it was switching from Adam optimizer to RMSPROP, and weight clipping. These things, together with Dropout and Batch Normalization, stabilized training. I did not do comprehensive experiments, but 5×5 convolutions seemed to perform better on this particular dataset.

Here you can see how the GAN learns to generate increasingly better novel handwritten digits.

output over epochs

The outputs look realistic, but they can be improved. Future improvements and code are on github.

Generating Art with Computers

In the world of deep learning, there’s never a dull moment. The breadth of interesting applications seems unbounded. It’s been applied to reach super-human performance in many areas like game playing and object recognition, and integral to exciting new technologies like self driving cars. Increasingly we find that the knowledge embodied in a trained neural network can be transferred to seemingly unrelated areas. Take for example the combination of two disparate ideas: object recognition and Picasso paintings. A net built for the sole purpose of recognizing objects in a photo has been found to be useful, completely unaltered, for the task of rendering an image in the style of any painting. This is the technology behind the scenes of apps like Prisma, but it is not hard to do yourself, and I have to say that implementing this was a lot of fun.

original image

These images were created, as described in Gatys et. al, by passing a random noise image through the VGG-16 convolutional network (with imagenet weights and top layers removed) and updating the pixels of the input image directly with gradient descent. An error function is devised, which quantifies how poorly the initial seed (random noise at first) balances between appearing like the original image, and at the same time in the style of the chosen painting. The derivative of the loss with respect to the RGB values of the noise image is calculated. The pixels are adjusted in their respective directions, and the process repeats. Amazingly, this works – the image looks more and more like a stylized version of the original image with each iteration of this optimization. This is due to the unreasonable effectiveness of gradient descent, convolutional neural networks, and well designed loss functions.

The error function combines content loss E_c and style loss E_s. This balances between preserving high level features of the original image, and with the textures of the painting. The errors compare the values of chosen convolutional layers when the input is the original image, generated image, or painting.

The content error is just the squared euclidian distance between corresponding filter activations under the stylized and original images.
\displaystyle E_c^l=\frac{1}{2}\sum _{i,j} {\left\|F_{i_j}^l-P_{i_j}^l\right\|^2} where F_{ij}^l is the activation of the i^{th} filter at position j in layer l.

The novelty that makes all this work is the style error. It can be done with a statistic on the channels of the convolutions in the higher layers of the network. This was originally designed to capture texture information in a texture synthesis algorithm. The style error of a layer is the mean squared error between the gramians, \displaystyle E_s^l=\frac{1}{4 N^2 M^2} \sum _{ij} {\left(G_{i_j}^l-A_{i_j}^l\right)^2} , where G_{ij}=\langle v_i,v_j \rangle is the inner product of the vectorized feature maps i and j in filter layer l of the style image, and A is the corresponding gramian when the input is noise. The inner product (gramian) shows the correlation between each pair of channels, and this captures texture information. The same thing in Keras code:

def gram_matrix(v):
    # In tensorflow, dim order is x,y,channel
    # Make each row a channel
    chans = K.permute_dimensions(x, (2, 0, 1))
    # vectorize the feature maps
    features = K.batch_flatten(chans)
    # gramian is just an inner product
    return K.dot(features, K.transpose(features)) 
             / x.get_shape().num_elements()
def style_loss(vi, vj):
    # mean squared error
    return mse(gram_matrix(vi), gram_matrix(vj))

The total error is \alpha E_c + \beta E_s with the weights on each error as hyperparameters. The results are quite astounding. Playing with weights \alpha and \beta, and with the weights on the contribution of each layer l towards the style loss allows for a large range of interesting results. This picture was created by decreasing the weight of the content loss.

One major drawback of this method can be seen in the creation above. The style loss must be tempered, or else it obstructs features of the image, like the face. The third set of images, in the style of “Woman in a Hat with Pompoms and a Printed Shirt” is another example. So portraits in general are not the best target. However, landscapes and the like come out amazing. The reason for this is that the style loss function does not take into account anything about the style image except the texture of the image. There are alternative statistics that have been tried. One successful result that I have not yet tried comes from the study of Markov Random Fields, used classically for image synthesis. The idea is to calculate the loss between patches of the filters, rather than the whole filter at once, where the loss of each patch of the generated image is calculated against the most similar patch (by cross correlation) for the painting.

Another drawback is that each generated image/painting combo must be calculated separately. This has been addressed by Johnson et. al and others by training a neural network which can turn an input image into a representation which, when passed through a loss network (such as VGG-16 as above), generates a stylized image of a particular style. The benefit is the speed – once trained, generating a stylized version of an input image is hundreds of orders of magnitude faster. The drawback is that the transformation network takes much longer to train and is only able to output images of the specific style it was trained on. However, this is the type of solution that can scale, for example to video. I have replicated exactly the network architecture as described in Johnson et. al. Here’s an example result:

I had trouble getting rid of artifacts showing up in some input images, like the blotch of white on the right of the stylized image. However, I did not train the network for very long, primarily to avoid large AWS GPU server bills, so the results are not that great. I’ll probably come back to this soon, as I am building my own GPU server! There are so many ideas to explore with this.

My own chess engine

I’ve written a chess engine named Slonik. It implements the Universal Chess Interface (UCI), so you can download any popular chess interface, like Scid vs Pc. or Chessbase, to analyze with or play against Slonik.

I’ve written this engine from scratch, and chose to write it in Python, so that I can iterate quickly. That makes the engine slower, but maybe one day I will port it to C++. However, I am happy with it’s playing strength, all considering. The details of the engine are on the github page, but to summarize:

  • Alpha-beta minimax, quiescence search
  • Bitboard piece/board representation
  • Various search heuristics, such as the history heuristic, extensions, reductions, etc.
  • Hand-coded evaluation function
  • Transposition hash table

I plan to return to working on this engine’s AI — specifically to use deep learning and reinforcement learning techniques rather than the current hand-coded evaluation function.

WordPress backup script

In my previous post I showed my WordPress update script. However, it’s not safe to update without first backing everything up in case something goes wrong. This is a script that I adapted from this post. It backs up both files and the database.

#!/bin/bash

echo "In $0"

if [ $# -gt 0 ]; then
    NOW=$1
else
    NOW=$(date +"%Y-%m-%d-%H%M")
fi

FILE="maksle.com.$NOW.tar"
BACKUP_DIR="/home/private/backups"
WWW_DIR="/home/public/blog"

DB_HOST="dbhost"
DB_USER="backupUser"
DB_PASS="backupUserPassword"
DB_NAME="wp_db"
DB_FILE="maksle.com.$NOW.sql"

# WWW_TRANSFORM='s,^home/public/blog,www,'
# DB_TRANSFORM='s,^home/private/backups,database,'
WWW_TRANSFORM=',/home/public/blog,www,p'
DB_TRANSFORM=',/home/private/backups,database,'


# tar -cvf $BACKUP_DIR/$FILE --transform $WWW_TRANSFORM $WWW_DIR
tar -cvf $BACKUP_DIR/$FILE -s $WWW_TRANSFORM $WWW_DIR

mysqldump --host=$DB_HOST -u$DB_USER -p$DB_PASS $DB_NAME > $BACKUP_DIR/$DB_FILE

# tar --append --file=$BACKUP_DIR/$FILE --transform $DB_TRANSFORM $BACKUP_DIR/$DB_FILE
tar --append --file=$BACKUP_DIR/$FILE -s $DB_TRANSFORM $BACKUP_DIR/$DB_FILE
rm $BACKUP_DIR/$DB_FILE
gzip -9 $BACKUP_DIR/$FILE

You may have noticed that there is a commented out version of the tar transform variable and command. My host has a version of tar (bsdtar 2.8.5) that doesn’t have the --transform option, but does have an alternative -s option that does more or less the same thing. The idea is that the backup will have directory stucture backup/file.php rather than /home/public/blog/file.php for example.

mysqldump has many options you can pass it, which you may want to look into. However, the option --opt is a default, and does what I want. It is probably good enough for most sites. The problem with --opt is that it requires locking the table during the export, which also has implications on permissions required for your backup user. What backup user? Well, since you are storing the DB user and password in plain text in your script, you should not use your administrator user. It’s best to create a backup user with minimal permissions necessary to do the backup. Ideally that would be just SELECT privileges, but with the mentioned --opt option, LOCK TABLES privileges are required too. Here’s how you’d set that user up:

MySQL> CREATE USER backup IDENTIFIED BY 'randompassword';
MySQL> GRANT SELECT ON *.* TO backup;
MySQL> GRANT LOCK TABLES ON *.* TO backup;

I call the above script from a cron job on my local computer:

#!/bin/bash

# Exit if any command fails
set -e 
# Don't allow use of unintialized variables
set -u 


# Set up some variables
NOW=$(date +"%Y-%m-%d-%H%M")
BACKUP_DIR="$HOME/Documents/backups"
LOG_DIR="${BACKUP_DIR}/logs"
LOG_FILE="maksle-backup-$NOW.log"

# Redirect standard output and error output to a log file.
exec > >(tee -a "${LOG_DIR}/${LOG_FILE}")
exec 2> >(tee -a "${LOG_DIR}/${LOG_FILE}" >&2)

mkdir -p $LOG_DIR
cd $BACKUP_DIR

# The cool part: Run my local wp-backup.sh on the remote web server.
ssh maksle 'bash -s' < ~/bin/wp-backup.sh $NOW

# Sync the remote server backup logs with the backups directory on my local machine. After all, what good are backups if your webserver is down and you can't access them?
rsync -havz --stats maksle:/home/private/backups/ $BACKUP_DIR

Of course, the remote server can get filled up with backups, so I have another script that removes any backups more than 5 days old. I continue to have as many as far back as I want on my local machine.

#!/bin/bash

set -e
set -u

# Error out if a command in a pipe fails
set -o pipefail

# Usage example:
# wp-remove-old-backups.sh /home/private/backups 5

WORKING_DIR=$1
cd $WORKING_DIR

# This would be 5 if called as in the Usage example 
declare -i allow=$2
# This gets the number of files in the directory, which we assume are all backup tgz files
declare -i num=$(ls | wc -l)

if [ $num -gt $allow ]; then
    # Remove all but latest files
    (ls -t | head -n $allow; ls) | sort | uniq -u | sed -e 's,.*,"&",g' | xargs rm -f
fi

The above command works by first printing the latest 5 files, and then all the files. This way the latest 5 files get printed twice. This allows uniq -u to filter out the latest 5, and the rest of the files get sent to their slaughter. The intermediate sed -e 's,.*,"&",g' makes it work when there are spaces in the filenames by wrapping the filenames in quotes (avoid spaces in filenames).

Of course, I call this script via a local cron job as well.

#!/bin/bash

BACKUP_DIR="$HOME/Documents/backups"
LOG_DIR="${BACKUP_DIR}/logs"
LOG_FILE="maksle-backup-cleanup-$NOW.log"

exec > >(tee -a "${LOG_DIR}/${LOG_FILE}")
exec 2> >(tee -a "${LOG_DIR}/${LOG_FILE}" >&2)

ssh maksle 'bash -s' < ~/bin/wp-remove-old-backups.sh "/home/private/backups" 5

I hope that will help someone out!

Wordupress update script

WordPress offers the one-click update, but the file permissions required for that convenience are a security risk. For it to work, it essentially requires setting all files to the server group (usually web or apache or nobody user) and giving all those files group write permissions. Doing so trades security for convenience. Eventually there will be a security vector in the WordPress code, and with writeable PHP files everywhere, hackers will make short work of it.

WordPress provides manual updating instructions, and even gives a few code snippets here and there, but there’s really nothing there that should require human intervention. This little script updates WordPress to the latest version. The location of this script should be in a location on the web server not accessible to the web, which is /home/private/update-wp in my case.

#!/bin/bash

set -u
set -e

# Cleanup from a previous call
rm -f latest.tar.gz
rm -rf wordpress
rm -rf backuptemp

# Get the latest, unzip it, and untar it
wget https://wordpress.org/latest.tar.gz
tar -xzvf latest.tar.gz

# The location of your wordpress install
blog=/home/public/blog

# Copy these just in case
mkdir backuptemp
cp $blog/wp-config.php $blog/.htaccess backuptemp

# These are the files to be deleted as mentioned in the WordPress Manual Update link
rm $blog/wp*.php
rm $blog/license.txt $blog/readme.html $blog/xmlrpc.php
rm -rf $blog/wp-admin $blog/wp-includes

# Copy the files to overwrite what we have
# It will leave files alone that are in $blog/wp-content but not in the latest bundle which is what we want
rsync -avz wordpress/ "${blog}/"
cp backuptemp/wp-config.php backuptemp/.htaccess $blog

echo "DONE"

If something goes wrong you have your daily backups to save you (because you are backing things up, aren’t you?). I will write another post shortly showing my WordPress files and database backup script.

First Pull Request

I have just made my first pull request on github. https://github.com/magnars/expand-region.el/pull/148

My contribution was to Magnar Sveen’s awesome expand-region project. The fix was for nxml-mode. Expand region inside an xml attribute was including the outer quotes first before first expanding to just the inner quotes. It was also not properly expanding to the attribute when there are namespaces in the attribute. This fix amends that.

Magnar messaged me that expand-region is headed for the emacs core. Awesome! All contributors need to sign the Free Software Foundation copyright papers. See https://gnu.org/licenses/why-assign for reasons. I went ahead and emailed assign@gnu.org and signed away my copyright on this piece of code.

I’m pretty excited to see this go through, because not everyone’s first pull request ever incidentally also makes it into a major FSF project, let alone into EMACS core!

etags-update-mode

Just a few days ago I wrote my first EMACS minor-mode, called etags-update-mode. It updates your TAGS file on save. It’s heavily inspired by another package/minor mode with the same name by Matt Keller.

In order to update the tags for a file on save, Matt’s etags-update-mode calls a perl file to delete any previous tags for a specific file in a TAGS file before it appends the new definitions in the file. Also, with that package the minor mode is defined as a global minor mode.

I wanted the functionality that the package provided, but I didn’t want it to be a global minor mode (the only global minor mode that I’ve used that I’m aware of and that I like having everywhere is YaSnippet). I also didn’t see why there should be a reliance on perl. I wanted to do it all in elisp.

So I wrote a much simpler version of etags-update-mode that is a regular minor mode and does all it’s work in EMACS. I’ll be updating it as I continue to use it.

EMACS etags

EMACS has an etags.el package that supports use of etags, the EMACS version of ctags. It tags your source code so you can jump directly to the source for a function, variable, or other symbol. I’ve been using it heavily with C++ and C# (though for C++, I’ve supplanted it with GNU Global, and there is an EMACS package for that too, ggtags).

I wanted the same functionality for xslt, which I use heavily at work. Luckily exuberant-ctags and etags both provide support for extending support to other languages, by supplying regular expressions.

I put the following regular expressions in ~/.ctags:

--langdef=xslt
--langmap=xslt:.xsl
--regex-xslt=/<xsl:template name="([^"]*)"/1/
--regex-xslt=/<xsl:template match="[^"]*"[ \t\n]+mode="([^"]*)"/1/
--regex-xslt=/<xsl:variable name="([^"]+)"/1/

… and generate the TAGS file

ctags -e -o TAGS *.xsl

I can now jump to the definition of any variable or template in my xsl files!

First try at Data Munging

I’ve been taking Udacity course Exploratory Data Analysis and decided that I wanted to try my hand at a real data set that I cared about. I ran into several obstacles that are probably common and I hope that this will help someone else.

The data I cared about was in SQL Server so first I got the data out:

bcp "select .. from .. where .." queryout data.dat -c -t"||||" -S server -U user -P pass

I chose “||||” as my delimiter because I was fairly sure that no value had four pipe characters. It’s much easier to search the file for a good delimiter once it’s in a text file. Once the data was out, I searched through the file data.dat and found that there were no asterisks in the entire file so I replaced all “||||” with “*” as my delimiters.

sed -i 's/||||/*/g' data.dat

I tried to load this into R with mydata <- read.csv("data.dat", sep="*") but ran into a problem:

Warning messages:
1: In read.table(file = file, header = header, sep = sep, quote = quote, :
line 2 appears to contain embedded nulls

I eventually realized that anything which was either NULL or an empty string in the SQL Server database comes out as 0x00, a binary null character. EMACS represents the binary null as ^@. I replaced these binary marks with ‘NA’ in EMACS with M-x replace-string ENT ^@ ENT NA ENT. As a side note, you can position the cursor on a symbol you want to know about and do M-x describe-char, it will tell you a lot of information about it. Another way to replace the symbol if you haven’t experienced the life and file altering wonders of EMACS is

sed -i 's/\x0/NA/g' data.dat

Now I tried read.csv and it seemed to work without errors, but I noticed that the number of ‘observations’ that R thinks are in the file (dim(mydata)) is not the same as the number of lines in the file, so I knew something was wrong. To see the number of lines in a file you can do wc -l output.dat in the terminal.

It took me quite some time to figure it out. The following finally worked correctly:

mydata <- read.table("data.dat", na.strings=c("", "NA"), sep="*", comment.char="", quote="")

?read.csv reveals that it actually calls read.table internally and makes some assumptions for you. One of those assumptions is sep="," but we specified that. The ones that got me were comment.char and quote. Actually, read.csv assumes that comment.char is "" which disables commenting altogether, which is good (for my data), but read.table sets it to "#". Additionally, read.csv sets quote="\"" by default. Initially after using read.table rather than read.csv, I started getting these types of errors:

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
line 9237 did not have 8 elements

I checked the line it complained about but it had 8 elements. I know that sometimes errors happen earlier than where the error message indicates. For a sanity check, I wrote this quick little diddy in Python to check the element count on each line:

#!/usr/bin/env python

linenum = 0
badlines = []

with open('data.dat', 'r') as orders:
    for line in orders.readlines():
        linenum = linenum + 1
        count = line.split('*');
        if not len(count) == 8:
            badlines.append(linenum)

print badlines

However, this came back with an empty array so I knew that there was something else going on. Once I took a closer look at the documentation though, and set quote="", disabling quotes altogether, I finally had no errors, and had the correct number of observations.

Also, while in the help page for read.table/read.csv, I found that na.strings was helpful to tell R to interpret blank fields as NA. By setting na.strings=c("", "NA"), we're telling R to interpret both "" and "NA" as NA.

There's more data manipulation I may need to do but for now I can finally start looking at the data.