Converting Lucas Chess personal opening guides to PGN

Most chess software tools generally allow for importing/exporting between any of their own formats for chess games or positions or whatever and PGN. Lucas Chess seems to be, for the most part, no exception; however, while it allows for importing from PGN to its Personal Opening Guide structure (.pgo files), it oddly doesn’t have an option to export in the other direction. I’m not a Lucas Chess user myself, but I stumbled upon that fact in a forum post from someone wanting to export their .pgo file to PGN for use in some other chess software, and asking if someone could program a converter for them.

That piqued my curiosity, so here I’ll lay out a quick way to get from a .pgo file to a standard .pgn file that can be imported to other programs, using nothing but an R script and David Barnes’ free pgn-extract utility.

It turns out that each .pgo file is just a SQLite database containing a single table called GUIDE. Some of the columns of the GUIDE table look like this:

XPV                 PV      POS
---                 --      ---
DT                  c2c4    1
DTn^                e7e5    2
DTn^;L              b1c3    3
DTn^;Lxg            g8f6    4
DTn^;Lxg@O          g1f3    5
DTn^;Lxg@Osd        b8c6    6
DTn^;Lxg@OsdFN      e2e3    7
DTn^;Lxg@OsdFNwS    f8b4    8

As can be seen, the XPV column contains strings that encode sequences of moves (in some encoding scheme that we don’t need to worry about), and the corresponding PV entry gives the final move of the sequence in long algebraic notation. The XPV for the eighth entry above encodes the sequence (in standard algebraic notation) 1.c4 e5 2.Nc3 Nf6 3.Nf3 Nc6 4.e3 Bb4. Ultimately, we need to get all of the move sequences contained in the GUIDE table into this standard algebraic notation that .pgn files use. We do this in two steps: first an R script that constructs the sequences in long algebraic notation, and then using pgn-extract to convert that to a proper .pgn file that can be imported and used freely.

The following R script converts a file called lucas_guide.pgo into a file called intermediate_output.txt that consists of all move sequences from the GUIDE table, presented in long algebraic notation.

## Path to your Lucas Chess .pgo file
yourPGOfile <- "lucas_guide.pgo"
## Path to the output file for this script, which will then feed to
## pgn-extract
outputFilePath <- "intermediate_output.txt"


## Use the RSQLite package to read the .pgo chess content into a data
## frame
con <- dbConnect(drv=RSQLite::SQLite(), dbname=yourPGOfile)
openingGuide <- dbGetQuery(conn=con,
                           statement="SELECT * FROM 'GUIDE'")

## We're only going to work with a couple of columns
openingGuide <- openingGuide[,c("XPV","PV")]

## Determine which moves in the guide are terminal (end of a line)
allXPV <- openingGuide$XPV

isEndOfLine <- function(x) {
               FUN=function(y){grepl(x,y,fixed=TRUE) & !(x==y)}))

openingGuide$isEndOfLine <- sapply(openingGuide$XPV,FUN=isEndOfLine)
endsOfLines <- openingGuide[openingGuide$isEndOfLine>0,]$XPV

## For each terminal move, reconstruct the line that leads to it, in
## long algebraic notation, and add a "*" character at the end to
## separate the lines as "games" in a file
allLines <- character()

for (j in 1:length(endsOfLines)) {
  aux <- openingGuide
  aux$includeInLine <- sapply(openingGuide$XPV, FUN=function(x) {
  aux <- aux[aux$includeInLine,]
  aux <- aux[with(aux,order(XPV)),]
  allLines[[j]] <- paste(paste(aux$PV,collapse=' '),'*',sep=' ')

## Write the lines/games to an output file, to feed into pgn-extract
writeLines(allLines, fileConn)

Barnes’ pgn-extract utility can then convert intermediate_output.txt into a standard .pgn file final_output.pgn as follows:

pgn-extract --output final_output.pgn intermediate_output.txt

That’s all there is to it, as long as you are after just the moves, and not, say, textual comments or NAGs that were included in the opening guide. Those could be had too, by enhancing the R script above to use appropriately the NAG and COMMENT columns that are part of the GUIDE table, but I leave that to any interested party.

Dr. Rabinowitz

About six years ago, I found myself waiting in the exam room of my doctor’s office, and enjoying a delightfully odd painting that was hanging on the wall. With nothing else to do while I waited, I snapped a picture of it, and that found its way to my dropbox. This morning I had the unfortunate occasion to remember this unusual painting, as I learned that Dr. Jerry Rabinowitz was one of the eleven victims of Saturday’s massacre at the Tree of Life synagogue.

I didn’t know him well at all, and only saw him a handful of times in his professional capacity. But my impression of the man was one of intelligence, warmth, good humor, and a sincere dedication to his patients. That such a man would be gunned down in so senseless and hateful an act as this weekend’s horror is tragic. I aim to do what little I can to make such events less likely going forward.

From a misplayed attack to a positional exchange sac

Looking at another old game I played against a fellow local club player. I played the opening in an unorthodox and clumsy way, but ended up with the chance to play a very straightforward kingside attack. At the chance to pull the trigger, I saw a ghost and bailed prematurely. On the plus side, this led to playing an interesting positional exchange sacrifice, which ultimately netted a win anyway.

“They can only take them one at a time” — Mikhail Tal

This is a game from several years ago that I played against local NM Franklin Chen. From the black side, I got myself into a cramped and delicate position. At one point, I had to find a defensive move that left 3 of my own pieces en prise, which solves all the problems I’d otherwise be facing on the board. It felt like the defensive flip side to Tal’s quip when launching attacks that involved offering multiple piece sacrifices, “They can only take them one at a time.”

After that point in the game, the tables turned a bit, and I was likely winning eventually. But I didn’t make the most of my chances, and things reached a peaceful end.

Category-theoretic view of SQL

David Spivak has given a nice, category-theoretic way of looking at certain elements of data analysis, and here I’m just going to spell out a few of the basic ideas for myself, comparing a handful of concrete SQL examples to the abstract machinery Spivak lays out. I might delve into more bits and pieces at a later date.

As I was going through his paper, I found myself thinking that these formulations of schemas and tables as category-theoretic structures could actually be practically useful, say in making more robust and manageable/trackable entity relationship models than what might exist out there right now. As it turns out, Spivak (along with others like Ryan Wisnesky) does indeed appear to be working on turning these ideas into usable tools, as evidenced by their Algebraic Query Language and a related commercial endeavor. It really seems like an interesting theoretical direction math-wise, and a promising possibility for use in actual data-analytic work.

The schema for a table is a specification of its column names, along with data type assignments for each column:

-- schema definition for a table 'my_table'
create table my_table (
   col1  integer
  ,col2  char(1)

Describing the schema mathematically, then, it consists of an ordered set C := \langle col1, col2\rangle of column names, along with a function \sigma : C \rightarrow \mathbf{DT} that assigns each column in the list to a SQL data type. (So we’re using \mathbf{DT} to denote the set of all data types available in SQL, like integer, char(1), varchar(27), etc.) To fix ideas, for the rest of this post I’ll assume that \mathbf{DT} actually consists of just the two types from our schema above: that we have only an integer type, and a single-character string type.

Now let’s start to get a little more abstract with our schema. We’ll let U be the infinite set of all objects that are of either type in \mathbf{DT}; that is, U is the union of the set \mathbb{Z} of integers, and the set of all single-character strings from our alphabet. Basically, U is the universe of all values that a cell of a data table in our world (with its limited \mathbf{DT} set) might be populated by.

Let’s let \pi : U \rightarrow \mathbf{DT} be the function that maps each object to its type, i.e. that maps each integer to the integer type and each single-character string to the char(1) type. Between our schema function \sigma and this type assignment \pi, we have the following picture:

\begin{CD} @. U \\ @. @V{\pi}VV \\ C @>{\sigma}>> \mathbf{DT} \end{CD}

To complete this picture a little bit, we want to add another idealization; we have the universe U of all values that could possibly go into a cell in some table, and we’d now like to craft the universe U_\sigma of all cell value assignments that could appear in a table that has schema \sigma. The full Cartesian product C\times U is overly inclusive, as we would have elements like \langle col2, 35\rangle, where an integer value has been erroneously assigned to col2. The right specification of the \sigma universe is

U_\sigma := C \times_\mathbf{DT} U = \{\langle c,u\rangle \in C\times U : \sigma(c)=\pi(u)\}

That is, it’s the most inclusive subset of C\times U that still respects the type assignments of the schema in questions, i.e. such that the following diagram commutes (where the unlabeled maps are the respective projection maps from U_\sigma to each of C and U):

\begin{CD} U_\sigma @>>> U \\ @VVV @V{\pi}VV \\ C @>{\sigma}>> \mathbf{DT} \end{CD}

Category-theoretically speaking, this U_\sigma is the pullback of \pi along \sigma.

Fixing the universal type assignment \pi, Spivak notes that with an appropriate notion of morphism between schemas \sigma : C \rightarrow \mathbf{DT} and \sigma' : C' \rightarrow \mathbf{DT}, we have a category of schemas of type \pi. Next up will be to build on this to form a category of data tables …